Field of the Invention

The present invention is in the area of software application
development and pertains particularly to methods and apparatus for
reducing data traffic associated with a voice XML application distribution
system through cache optimization.

Cross-Reference to Related Documents

The present invention claims priority as a continuation in part of a
U.S. patent application, serial number 10/190,080, entitled "Method and
Apparatus for Improving Voice recognition performance in a voice
application distribution system", filed on 07/02/2002, which is a
continuation in part of U.S. patent application serial number 10/173,333,
entitled "Method for Automated Harvesting of Data from A Web site
using a Voice Portal System", filed on 06/14/2002, which claims priority
to provisional application serial number 60/302,736. The instant application
claims priority to the above-mentioned applications in their entirety by
reference.






Background of the Invention

A speech application is one of the most challenging applications to
develop, deploy and maintain in a communications (typically telephony)
environment. Expertise required for developing and deploying a viable
application includes expertise in computer telephony integration (CTI)
hardware and software, voice recognition software, text-to-speech software,
and speech application logic.

With the relatively recent advent of voice extensible markup language
(VXML) the expertise required to develop a speech solution has been reduced
somewhat. VXML is a language that enables a software developer to focus
on the application logic of the voice application without being required to
configure underlying telephony components. Typically, the developed
voice application is run on a VXML interpreter that resides on and executes
on the associated telephony system to deliver the solution.

As is shown in Fig. 1A (prior art), a typical architecture of a
VXML-compliant telephony system comprises a voice application server (110)
and a VXML-compliant telephony server (130). Typical steps for development
and deployment of a VXML-enabled IVR solution are briefly described
below using the elements of Fig. 1A.

Firstly, a new application database (113) is created or an existing one
is modified to support VXML. Application logic 112 is designed in terms of
workflow and adapted to handle the routing operations of the IVR system.
VXML pages, which are results of functioning application logic, are
rendered by a VXML rendering engine (111) based on a specified generation
sequence.

Secondly, an object facade to server 130 is created comprising the
corresponding VXML pages and is sent to server 130 over a network (120),
which can be the Internet, an Intranet, or an Ethernet network. The VXML
pages are integrated into rendering engine 111 such that they can be
displayed according to set workflow at server 110.

Thirdly, the VXML-telephony server 130 is configured to enable
proper retrieval of specific VXML pages from rendering engine 111 within
server 110. A triggering mechanism is provided to server 110 so that when
a triggering event occurs, an appropriate outbound call is placed from server
110.

A VXML interpreter (131), a voice recognition text-to-speech
engine (132), and the telephony hardware/software (133) are provided
within server 130 and comprise server function. In prior art, the telephony
hardware/software 133 along with the VXML interpreter 131 are packaged
as an off-the-shelf IVR-enabling technology. Arguably the most important
feature, however, of the entire system is the application server 110. The
application logic (112) is typically written in a programming language such
as Java and packaged as an enterprise Java Bean archive. The presentation
logic required is handled by rendering engine 111 and is written in JSP or
PERL.

An enhanced voice application system is known to the inventor and
disclosed in the U.S. patent application entitled "Method and Apparatus for
Development and Deployment of a Voice Software Application for
Distribution to one or more Application Consumers", to which this
application claims priority. That system uses a voice application server that
is connected to a data network for storing and serving voice applications.
The voice application server has a data connection to a network
communications server connected to a communications network such as the
well-known PSTN network. The communication server routes the created
voice applications to their intended recipients.

A computer station is provided as part of the system and is
connected to the data network and has access to the voice application server.
A client software application is hosted on the computer station for the
purpose of enabling users to create applications and manage their states. In
this system, the user operates the client software hosted on the computer
station in order to create voice applications through object modeling and
linking. The applications, once created, are then stored in the application
server for deployment. The user can control and manage deployment and
state of deployed applications including scheduled deployment and repeat
deployments in terms of intended recipients.

In one embodiment, the system is adapted for developing and
deploying a voice application using Web-based data as source data over a
communications network to one or more recipients. The enhanced system
has a voice application server capable through software and network
connection of accessing a network server and Web site hosted therein and
for pulling data from the site. The computer station running voice
application software has control access to at least the voice application
server and is also capable of accessing the network server and Web site. An
operator of the computer station creates and provides templates for the
voice application server to use in data-to-voice rendering. In this aspect,
Web data can be harvested from a Web-based data source and converted to
voice for delivery as dialogue in a voice application.

In another embodiment, a method is available in the system described
above for organizing, editing, and prioritizing the Web-based data before
dialog creation is performed. The method includes harvesting the
Web-based data source in the form of its original structure; generating an
object tree representing the logical structure and content type of the
harvested, Web-based data source; manipulating the object tree generated to
a desired hierarchical structure and content; creating a voice application
template in VXML and populating the template with the manipulated object
tree; and creating a voice application capable of accessing the Web-based
data source according to the constraints of the template. The method allows
streamlining of voice application deployment and executed state and a
simplified development process of the voice application.

A security regimen is provided for the above-described system. The
protocol provides transaction security between a Web server and data and a
voice portal system accessible through a telephony network on the user end
and through an XML gateway on the data source end. The regimen includes
one of a private connection, a virtual private network, or a secure socket
layer, set up between the Web server and the Voice Portal system through
the XML gateway. Transactions carried on between the portal and the
server or servers enjoy the same security that is available between secure
nodes on the data network. In one embodiment, the regimen further
includes a voice translation system distributed at the outlet of the portal and
at the telephone of the end user, wherein the voice dialog is translated to an
obscure language not that of the user's language and then retranslated to the
user's language at the telephone of the user.

In such a system, where templates are used to enable voice
application dialog transactions, voice application rules and voice recognition
data are consulted for the appropriate content interpretation and response
protocol so that the synthesized voice presented as response dialog through
the voice portal to the user is both appropriate in content and hopefully error
free in expression. The database is therefore optimized with vocabulary
words that enable a very wide range of speech covering many different
vocabulary words akin to many differing business scenarios.

According to yet another aspect of the invention, vocabulary
recognition is tailored for active voice applications according to client
parameters. This is accomplished through a vocabulary management system
adapted to constrain voice recognition processing associated with
text-to-speech and speech-to-text rendering associated with use of an active
voice application in progress between a user accessing a data source through a
voice portal. The enhancement includes a vocabulary management server
connected to a voice application server and to a telephony server, and an
instance of vocabulary management software running on the management
server for enabling vocabulary establishment and management for voice
recognition software. In practice of the enhanced vocabulary management
capability, an administrator accessing the vocabulary management server
uses the vocabulary management software to create unique vocabulary sets
or lists that are specific to selected portions of vocabulary associated with
target data sources, the vocabulary sets differing in content according to
administrator direction.

It will be appreciated by one with skill in the art of voice application
deployment architecture that many users vying to connect and interact with a
voice portal may in some cases create a bottleneck wherein data lines
connecting voice application components to Web-sources and other data
sources become taxed to their capacities. This problem may occur especially
at peak use periods as is common for many normal telephony environments.
It has occurred to the inventor that still more streamlining in terms of traffic
optimization is required to alleviate potential line use issues described above.

Therefore, what is clearly needed is a method and apparatus for
dynamic optimization of local cache components in a VXML distribution
system, especially between an application server and a voice portal. Such a
system would improve data carrying efficiency over critical data lines and
improve response time at the voice portal.


Summary of the Invention 

In a preferred embodiment of the invention, in a
voice-extensible-markup-language-enabled voice-application deployment
architecture, an application logic for determining which portions of a voice
application for deployment should be cached at an application-receiving end
system or systems is provided, comprising a processor for processing the voice
application according to sequential dialog files of the application, a static
content optimizer connected to the processor for identifying files containing
static content, and a dynamic content optimizer connected to the processor
for identifying files containing dynamic content. The application is
characterized in that the optimizers determine which files should be cached
at which end-system facilities, tag the files accordingly, and prepare those
files for distribution to selected end-system cache facilities for local retrieval
during consumer interaction with the deployed application.

In preferred embodiments the static and dynamic optimizers are
software routines. Also in preferred embodiments the static and dynamic
optimizers are firmware components embedded into the processor. Also in
preferred embodiments the processor is a dialog runtime processor dedicated
to processing subsequent dialogs of a voice application. Further, the
deployment architecture may include an application server and a voice
portal.

In some preferred embodiments the dynamic optimizer identifies
dynamic content according to a determination of non-recurring menu dialog
and non-recurring result dialog fetched as a result of consumer interaction
with the voice application. Further, the cache facility at the end system may
be a telephony server cache. In other cases the cache facility at the end
system may be a Web controller cache.

In some embodiments the file tagging is accomplished using HTTP 1.1
resource tagging. In some cases dynamic tagging by the dynamic
optimizer uses results from statistical analysis to determine which files to tag
for distribution to an end-system cache. In some embodiments dynamic
optimization continues after application deployment, the continued dynamic
tagging relying on changing statistical probability results.
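
By way of illustration only, the tagging described above can be sketched in Python using standard HTTP 1.1 Cache-Control directives. The DialogFile structure, the hit_probability statistic, the 0.5 threshold, and the specific max-age values are assumptions made for this sketch and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class DialogFile:
    name: str
    is_static: bool
    # Fraction of observed calls that fetched this file (hypothetical statistic).
    hit_probability: float = 0.0


def cache_headers(dialog: DialogFile, threshold: float = 0.5) -> dict:
    """Return HTTP/1.1 response headers that tag a dialog file for caching."""
    if dialog.is_static:
        # Static content comes from the producer and does not change per call.
        return {"Cache-Control": "public, max-age=86400"}
    if dialog.hit_probability >= threshold:
        # Frequently requested dynamic content: cache briefly, then revalidate.
        return {"Cache-Control": "max-age=60, must-revalidate"}
    # Rarely requested dynamic content is never stored at the end system.
    return {"Cache-Control": "no-store"}
```

Because the statistic is an input rather than a constant, the same routine can be re-run after deployment as the observed probabilities change, which is one way the continued dynamic tagging might be realized.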

In another aspect of the invention a system for creating and
distributing interactive voice applications to end users is provided,
comprising a voice application server, a voice application, a voice portal, and
a network for delivery. The system is characterized in that the voice
application determines which dialog files of a finished voice application will
be cached locally at the voice portal for subsequent local retrieval during
end-user interaction with the application.

In preferred embodiments the voice application has a static and
dynamic optimizer connected to a dialog runtime processor, the optimizers
cooperating locally to tag and prepare cacheable content of the voice
application for caching and subsequent retrieval from the voice portal. Also
in preferred embodiments the network for delivery is a telephony network.
In still other embodiments the network for delivery is a data network. In yet
other embodiments the delivery network is a combination of a data network
and a telephony network, the application delivered through a network bridge.

In some cases the static and dynamic optimizers are firmware
components embedded into the processor. Also, the dialog runtime
processor may be dedicated to processing subsequent dialogs of a voice
application. In yet other embodiments the dynamic optimizer identifies
dynamic content according to a determination of non-recurring menu dialog
and non-recurring result dialog fetched as a result of consumer interaction
with the voice application.

In some preferred embodiments the voice portal includes a telephony
server and cache. There may also be a Web controller and cache. In some
cases the static and dynamic optimizers tag files determined to be cacheable
according to HTTP 1.1 regimen.

In yet other embodiments dynamic tagging by the dynamic optimizer
uses statistical analysis to determine which files to tag for distribution to an
end-system cache. Also in other embodiments dynamic optimization
continues after application deployment, the continued dynamic tagging
relying on changing statistical probability results.

In yet another aspect of the invention a method for identifying
specific dialog files of a voice application for local file caching at targeted
end systems, the application pending deployment from a voice application
server, and deploying the selected files to the targeted cache systems for local
retrieval during voice application interaction is provided, comprising steps of
(a) running the voice application at the voice application server; (b)
identifying static dialogs of the application and tagging them appropriately;
(c) identifying dynamic dialogs of the application and tagging them
appropriately; (d) deploying the static and dynamic dialog files identified and
tagged to selected target cache systems; and (e) retrieving, at the end
systems, the tagged files from local cache to play in real time and in proper
order with the deployed voice application.
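
Steps (d) and (e) above can be sketched, purely as an illustration, as a local cache at the end system that serves tagged files and falls back to the application server for everything else. The PortalCache class, its method names, and the boolean "cacheable" tag are hypothetical names introduced for this sketch.

```python
class PortalCache:
    """Hypothetical local cache at the voice portal (end system)."""

    def __init__(self, fetch_from_server):
        self._store = {}
        self._fetch = fetch_from_server  # fallback to the application server
        self.remote_fetches = 0

    def deploy(self, tagged_files):
        # Step (d): store every file the optimizers tagged as cacheable.
        for name, (cacheable, body) in tagged_files.items():
            if cacheable:
                self._store[name] = body

    def retrieve(self, name):
        # Step (e): serve locally when possible, otherwise use the data line.
        if name in self._store:
            return self._store[name]
        self.remote_fetches += 1
        return self._fetch(name)


def play_application(order, cache):
    """Return dialog bodies in the order the application plays them."""
    return [cache.retrieve(name) for name in order]
```

The remote_fetches counter is included only to make the traffic reduction observable: each dialog served from the local store is one request that does not cross the line between the portal and the application server.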

In preferred embodiments of this method, in step (a), the application 
is run on a runtime processor connected to a rules engine. Also in preferred 
embodiments, in step (b), the static dialogs are identified and tagged by a 
static optimizer routine connected to the processor. In other preferred 
embodiments, in step (c), the dynamic dialogs are identified and tagged by a 
dynamic optimizer routine connected to the processor. 

In some embodiments, in steps (b) and (c), tagging is accomplished 
using HTTP 1.1 regimen. Also in some embodiments, in step (d), the
selected files are deployed ahead of the voice application, the deployed 
application, when deployed, missing the selected files. In still other 
embodiments, in step (d), the selected files are deployed with the voice 
application and saved to the local cache systems at a first interaction with the 
deployed application. In yet other embodiments, in step (c), dynamic dialogs 
include dynamic menus and dynamic data results fetched as a result of menu 
interaction. 



Brief Description of the Drawing Figures 

Fig. 1A is a block diagram illustrating a basic architecture of a
VXML-enabled IVR development and deployment environment according to
prior art.

Fig. 1B is a block diagram illustrating the basic architecture of Fig.
1A enhanced to practice the present invention.

Fig. 2 is a process flow diagram illustrating steps for creating a voice
application shell or container for a VXML voice application according to an
embodiment of the present invention.

Fig. 3 is a block diagram illustrating a simple voice application 
container according to an embodiment of the present invention. 

Fig. 4 is a block diagram illustrating a dialog object model according 
to an embodiment of the present invention. 

Fig. 5 is a process flow diagram illustrating steps for voice dialog
creation for a VXML-enabled voice application according to an embodiment 
of the present invention. 

Fig. 6 is a block diagram illustrating a dialog transition flow after 
initial connection with a consumer according to an embodiment of the 
present invention. 

Fig. 7 is a plan view of a developer's frame containing a developer's
login screen according to an embodiment of the present invention.

Fig. 8 is a plan view of a developer's frame containing a screen shot 
of a home page of the developer's platform interface of Fig. 7. 

Fig. 9 is a plan view of a developer's frame containing a screen shot 
of an address book 911 accessible through interaction with the option
Address in section 803 of the previous frame of Fig. 8. 

Fig. 10 is a plan view of a developer's frame displaying a screen 1001 
for creating a new voice application. 

Fig. 11 is a plan view of a developer's frame illustrating the screen of
Fig. 10 showing further options as a result of scrolling down.

Fig. 12 is a screen shot of a dialog configuration window illustrating 
a dialog configuration page according to an embodiment of the invention. 

Fig. 13 is a screen shot 1300 of the dialog design panel of Fig. 12
illustrating progression of dialog state to a subsequent contact.

Fig. 14 is a screen shot of a thesaurus configuration window
activated from the example of Fig. 13 according to a preferred embodiment.

Fig. 15 is a plan view of a developer's frame illustrating a screen for
managing created modules according to an embodiment of the present
invention.

Fig. 16 is a block diagram of the dialog transition flow of Fig. 6 
enhanced for Web harvesting according to an embodiment of the present 
invention. 

Fig. 17 is a block diagram of the voice application distribution
environment of Fig. 1B illustrating added components for automated Web
harvesting and data rendering according to an embodiment of the present
invention.

Fig. 18 is a block diagram illustrating a Web-site logical hierarchy
harvested and created as an object model.

Fig. 19 is a block diagram illustrating the model of Fig. 18 being
manipulated to simplify the model for economic rendering.

Fig. 20 is a process flow diagram illustrating intermediary steps for 
reducing complexity of a Web-site logical tree. 

Fig. 21 is a block diagram illustrating secure connectivity between a
Voice Portal and a Web server according to an embodiment of the invention.

Fig. 22 is a block diagram illustrating the architecture of Fig. 1B
enhanced with a vocabulary management server and software according to 
an embodiment of the present invention. 

Fig. 23 is a block diagram illustrating various functional components
of a VXML application architecture including cache optimization
components according to an embodiment of the present invention.

Fig. 24 is a process flow diagram illustrating steps for practice of the
present invention.

Description of the Preferred Embodiments

According to preferred embodiments of the present invention, the
inventor teaches herein, in an enabling fashion, a novel system for developing
and deploying real-time dynamic or static voice applications in an
object-oriented way that enables inbound or outbound delivery of IVR and
other interactive voice solutions in supported communications environments.

Fig. 1A is a block diagram illustrating a basic architecture of a
VXML-enabled IVR development and deployment environment according to
prior art. As described with reference to the background section, the
prior-art architecture of this example is known to and available to the
inventor. Developing and deploying voice applications for the illustrated
environment, which in this case is a telephony environment, requires a very
high level of skill in the art. Elements of this prior-art example that have
already been introduced with respect to the background section of this
specification shall not be re-introduced.
In this simplified scenario, voice application server 110 utilizes
database/resource adapter 113 for accessing a database or other resources
for content. Application logic 112 comprising VXML script, business rules,
and underlying telephony logic must be carefully developed and tested before
single applications can be rendered by rendering engine 111. Once voice
applications are complete and servable from server 110, they can be
deployed through data network 120 to telephony server 130 where
interpreter 131 and text-to-speech engine 132 are utilized to formulate and
deliver the voice application in useable or playable format for telephony
software and hardware 133. The applications are accessible to a receiving
device, illustrated herein as device 135, a telephone, through the prevailing
network 134, which is in this case a public-switched-telephone-network
(PSTN) linking the telephony server to the consumer (device 135) generally
through a telephony switch (not shown).

Improvements to this prior-art example in embodiments of the
present invention concern and are focused in the capabilities of application
server 110 with respect to development and deployment issues and with
respect to overall enhancement to response capabilities and options in
interaction dialog that is bi-directional. Using the description of existing
architecture deemed state-of-art architecture, the inventor herein describes
additional components that are not shown in the prior-art example of Fig.
1A, but are illustrated in a novel version of the example represented herein
by Fig. 1B.

Fig. 1B is a block diagram illustrating the basic architecture of Fig.
1A enhanced to illustrate an embodiment of the present invention. Elements
of the prior-art example of Fig. 1A that are also illustrated in Fig. 1B retain
their original element numbers and are not re-introduced. For reference
purposes an entity (a person) that develops a voice application shall be
referred to hereinafter in this specification as either a producer or developer.

A developer or producer of a voice application according to an
embodiment of the present invention operates preferably from a remote
computerized workstation illustrated herein as station 140. Station 140 is
essentially a network-connected computer station. Station 140 may be
housed within the physical domain also housing application server 110. In
another embodiment, station 140 and application server 110 may reside in
the same machine. In yet another embodiment, a developer may operate
station 140 from his or her home office or from any network-accessible
location including any wireless location.

Station 140 is equipped with a client software tool (CL) 141, which
is adapted to enable the developer to create and deploy voice applications
across the prevailing system represented by servers 110, 130, and by
receiving device 135. CL 141 is a Web interface application similar to or
incorporated with a Web browser application in this example; however,
other network situations may apply instead. CL 141 contains the software
tools required for the developer to enable enhancements according to
embodiments of the invention. Station 140 is connected to a voice portal
143 that is maintained either on the data network (Internet, Ethernet,
Intranet, etc.) and/or within telephony network 134. In this example portal
143 is illustrated logically in both networks. Voice portal 143 is adapted to
enable a developer or a voice application consumer to call in and perform
functional operations (such as access, monitor, modify) on selected voice
applications.

Within application server 110 there is an instance of voice application
development server 142 adapted in conjunction with the existing
components 111-113 to provide dynamic voice application development and
deployment according to embodiments of the invention.

Portal 143 is accessible via network connection to station 140 and
via a network bridge to a voice application consumer through telephony
network 134. In one example, portal 143 is maintained as part of application
server 110. Portal 143, in addition to being an access point for consumers, is
chiefly adapted as a developer's interface server. Portal 143 is enabled by a
SW instance 144 adapted as a server instance to CL 141. In a telephony
embodiment, portal 143 may be an interactive voice response (IVR) unit.

In a preferred embodiment, the producer or developer of a voice
application accesses application server 110 through portal 143 and data
network 120 using remote station 140 as a "Web interface" and first creates
a list of contacts. In an alternative embodiment, station 140 has direct
access to application server 110 through a network interface. Contacts are
analogous to consumers of created voice applications. CL 141 displays,
upon request and in order of need, all of the required interactive interfaces
for designing, modifying, instantiating, and executing completed voice
applications to launch from application server 110 and to be delivered by
server 130.

The software of the present invention enables voice applications to
be modeled as a set of dialog objects having business and telephony (or other
communication delivery/access system) rules as parameters without requiring
the developer to perform complicated coding operations. A dialog template
is provided for modeling dialog states. The dialog template creates the
actual speech dialog, specifies the voice application consumer (recipient) of
the dialog, captures the response from the voice application consumer and
performs any follow-up actions based upon system interpretation of the
consumer response. A dialog is a reusable component and can be linked to a
new dialog or to an existing (stored) dialog. A voice application is a set of
dialogs inter-linked by a set of business rules defined by the voice application
producer. Once the voice application is completed, it is deployed by server
110 and is eventually accessible to the authorized party (device 135) through
telephony server 130.
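
As a non-limiting illustration of the dialog object model just described, the following Python sketch represents a voice application as reusable dialogs linked by response-driven rules. The class names, field names, and the use of a simple response-to-dialog mapping as the "business rule" are assumptions of this sketch, not the data model of the described system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Dialog:
    name: str
    prompt: str  # speech played to the voice application consumer
    # Maps a recognized consumer response to the name of the next dialog.
    transitions: Dict[str, str] = field(default_factory=dict)


@dataclass
class VoiceApplication:
    start: str
    dialogs: Dict[str, Dialog]

    def next_dialog(self, current: str, response: str) -> Optional[Dialog]:
        """Apply the linking rule for 'current'; None means termination."""
        target = self.dialogs[current].transitions.get(response)
        return self.dialogs[target] if target else None
```

Because each Dialog refers to its successors only by name, the same stored dialog can be linked into several applications, which corresponds to the reuse described above.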

The voice applications are in a preferred embodiment in the form of
VXML to run on VXML-compliant telephony server 130. This process is
enabled through VXML rendering engine 111. Engine 111 interacts directly
with server 130, locates the voice application at issue, retrieves its voice
application logic, and dynamically creates the presentation in VXML and
forwards it to server 130 for processing and delivery. Once interpreter 131
interprets the VXML presentation it is sent to or accessible to device 135 in
the form of an interactive dialog (in this case an IVR dialog). Any response
from device 135 follows the same path back to application server 110 for
interpretation by engine 111. Server 110 then retrieves the voice application
profile from the database accessible through adapter 113 and determines the
next business rule to execute locally. Based upon the determination a
corresponding operation associated with the rule is taken. A next (if
required) VXML presentation is then forwarded to rendering engine 111,
which in turn dynamically generates the next VXML page for interpretation,
processing and deployment at server 130. This two-way interaction between
the VXML-compliant telephony server (130) and the voice application
server (110) continues in the form of an automated logical sequence of
VXML dialogs until the voice application finally reaches its termination
state.
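
The page-by-page exchange described above may be sketched as follows, by way of example only. The sketch generates a simplified VXML page for each dialog and applies a rule to each consumer response until no further rule applies. The function names, the rule table keyed by (dialog, response), and the reduced page structure (which omits the XML namespace, grammars, and submission logic that a complete VoiceXML document would carry) are assumptions of this illustration.

```python
import xml.etree.ElementTree as ET


def render_vxml(dialog_name: str, prompt_text: str) -> str:
    """Build a minimal VXML page containing one form, one field, one prompt."""
    vxml = ET.Element("vxml", version="2.0")
    form = ET.SubElement(vxml, "form", id=dialog_name)
    field = ET.SubElement(form, "field", name="response")
    prompt = ET.SubElement(field, "prompt")
    prompt.text = prompt_text
    return ET.tostring(vxml, encoding="unicode")


def run_session(rules, prompts, start, responses):
    """Generate pages one at a time until the application terminates."""
    pages = []
    current = start
    answers = iter(responses)
    while current is not None:
        # Application server side: render the page for the current dialog.
        pages.append(render_vxml(current, prompts[current]))
        # Telephony server side: the consumer's response comes back.
        answer = next(answers, None)
        # Application server side: pick the next rule, or terminate.
        current = rules.get((current, answer))
    return pages
```

Each pass through the loop corresponds to one round trip between the two servers, which is the traffic that local caching of recurring pages is intended to reduce.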

A voice application (set of one or more dialogs) can be delivered to 
the consumer (target audience) in outbound or inbound fashion. For an 
inbound voice application, a voice application consumer calls in to voice 
portal 143 to access the inbound voice application served from server 130. 
The voice portal can be mapped to a phone number directly or as an 
extension to a central phone number. In a preferred embodiment the voice 
portal also serves as a community forum where voice application producers 
can put their voice applications into groups for easy access and perform 
operational activities such as voice application linking, reporting,
text-to-speech recording and so on.

For an outbound voice application there are two sub-types. These
are on-demand outbound applications and scheduled outbound applications.
For on-demand outbound applications server 110 generates an outbound call
as soon as the voice application producer issues an outbound command
associated with the application. The outbound call is made to the target
audience and upon the receipt of the call the voice application is launched
from server 130. For scheduled outbound applications, the schedule server
(not shown within server 110) launches the voice application as soon as the
producer-specified date and time has arrived. In a preferred embodiment
both on-demand and scheduled outbound application deployment functions
support unicast, multicast, and broadcast delivery schemes.
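
The distinction between the two outbound sub-types can be illustrated with a short Python sketch that decides which applications are due for launch at a given moment. The dictionary keys ("mode", "command_issued", "launch_at") are hypothetical field names chosen for this sketch.

```python
from datetime import datetime


def due_applications(apps, now):
    """Return names of outbound applications that should be launched at 'now'."""
    due = []
    for app in apps:
        if app["mode"] == "on_demand" and app.get("command_issued"):
            # On-demand: launch as soon as the producer issues the command.
            due.append(app["name"])
        elif app["mode"] == "scheduled" and app["launch_at"] <= now:
            # Scheduled: launch once the producer-specified time has arrived.
            due.append(app["name"])
    return due
```

A schedule server of the kind mentioned above could call such a routine periodically and place one call (unicast) or many (multicast or broadcast) for each name returned.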

As described above, a voice application created by application server
110 consists of one or more dialogs. The contents of each dialog can be
static or dynamic. Static content is content sourcing from the voice
application producer. The producer creates the contents when the voice
application is created. Dynamic content sources from a third-party data
source.

In a preferred embodiment a developer's tool contains an interactive 
dialog design panel (described in detail later) wherein a producer inputs a 
reference link in the form of Extensible Markup Language (XML) to the 
dialog description or response field. When a dialog response is executed and 
interpreted by application server 110, the reference link invokes a resource 
Application-Program-Interface (API) that is registered in resource adapter 
113. The API goes out in real time, retrieves the requested data, and 
integrates the returned data into the existing dialog. The resulting and 
subsequent VXML page being generated has the dynamic data embedded 
in it. 

One object of the present invention is a highly dynamic, real time 
IVR system that tailors itself automatically to the application developer's 
specified data source requirement. Another object of the present invention is 
to enable rapid development and deployment of a voice application without 
requirement of any prior knowledge of VXML or any other programming 
technologies. A further object of the present invention is to reduce the 
typical voice application production cycle and drastically reduce the cost of 
production. 

Fig. 2 is a process flow diagram illustrating steps for creating a voice 
application shell or container for a VXML voice application according to an 
embodiment of the present invention. A developer utilizing a client 
application known as a thin client analogous to CL 141 on station 140 
described with reference to Fig. 1B, creates a voice application shell or voice 
application container. At step 201 the developer logs in to the system at a 
login page. At step 202 the developer creates a contact list of application 
consumers. Typically a greeting or welcome page would be displayed before 
step 202. An application consumer is an audience of one or more entities 
that would have access to and interact with a voice application. A contact 
list is first created so that all of the intended contacts are available during 
voice application creation if call routing logic is required later on. In the 
event of more than one contact, the producer can either enter the contacts 
individually or import them as a set list from organizer/planner software, 
such as Microsoft Outlook™ or perhaps a PDA™ organizer. 

In one embodiment of the present invention the contact list may 
reside on an external device accessed by a provided connector (not shown) 
that is configured properly and adapted for the purpose of accessing and 
retrieving the list. This approach may be used, for example, if a large, 
existing customer database is used. Rather than create a copy, the needed 
data is extracted from the original and provided to the application. 

At step 203, a voice application header is populated. A voice 
application header is simply a title field for the application. The field 
contains a name for the application and a description of the application. At 
step 204, the developer assigns either an inbound or outbound state for the 
voice application. An outbound application is delivered through an 
outbound call, whereas the consumer calls in to access an inbound voice 
application. 

In the case of the inbound application, in step 205 the system sets a 
default addressee for inbound communications. The developer selects a 
dialog from a configured list in step 206. It is assumed in this example that 
the dialogs have already been created. At step 207, the developer executes 
the dialog and it is deployed automatically. 

In the case of an outbound designation in step 204, the developer 
chooses a launch type in step 208. A launch type can be either an 
on-demand type or a scheduled type. If the choice made by the developer in 
step 208 is scheduled, then in step 209, the developer enters all of the 
appropriate time and date parameters for the launch including parameters for 
recurring launches of the same application. In the case of an on-demand 
selection for application launch in step 208, then in step 210 the developer 
selects one or more contacts from the contact list established in step 202. It 
is noted herein that step 210 is also undertaken by the developer after step 
209 in the case of a scheduled launch. At step 207, the dialog is created. In 
this step a list of probable dialog responses for a voice application wherein 
interaction is intended may also be created and stored for use. 
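
The branching of Fig. 2 described above can be summarized in a short sketch. The Python below is illustrative only: the function name plan_shell_steps and its string arguments are assumptions introduced here and are not part of the disclosed system. It returns the ordered step numbers that the text describes for each launch choice.

```python
from typing import List, Optional


def plan_shell_steps(state: str, launch_type: Optional[str] = None) -> List[int]:
    """Return the Fig. 2 step numbers visited for a given configuration.

    state is "inbound" or "outbound"; launch_type ("on-demand" or
    "scheduled") applies only to outbound applications.
    Hypothetical helper, not the patented implementation.
    """
    steps = [201, 202, 203, 204]      # login, contact list, header, state
    if state == "inbound":
        steps += [205, 206]           # default addressee, select a dialog
    elif state == "outbound":
        steps.append(208)             # choose a launch type
        if launch_type == "scheduled":
            steps.append(209)         # enter date and time parameters
        elif launch_type != "on-demand":
            raise ValueError("launch type must be on-demand or scheduled")
        steps.append(210)             # select contacts from the contact list
    else:
        raise ValueError("state must be inbound or outbound")
    steps.append(207)                 # create or execute the dialog
    return steps
```

Note that a scheduled launch passes through both step 209 and step 210, as the text states, while an on-demand launch skips step 209.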

In general sequence, a developer creates a voice application, 
integrates the application with a backend data source or, optionally, any 
third-party resources, and deploys the voice application. The application 
consumer then consumes the voice application and, optionally, the system 
analyzes any consumer feedback collected by the voice application for further 
interaction if appropriate. The steps of this example pertain to generating and 
launching a voice application from "building blocks" that are already in place. 

Fig. 3 is a block diagram illustrating a simple voice application 
container 300 according to an embodiment of the present invention. 
Application container 300 is a logical container or "voice application object". 
Also termed a shell, container 300 is logically illustrated as a possible 
result of the process of Fig. 2 above. Container 300 contains one or more 
dialog states illustrated herein as dialogs 301a-n labeled in this example as 
dialogs 1-4. Dialogs 301a-n are objects and therefore container 300 is a 
logical grouping of the set of dialog objects 301a-n. 

The represented set of dialog objects 301a-n is interlinked by 
business rules labeled rules 1-4 in this example. Rules 1-4 are defined by the 
developer and are rule objects. It is noted herein that there may be 
many more or fewer dialog objects 301a-n as well as interlinking business 
rule objects 1-4 comprising container object 300 without departing from the 
spirit and scope of the present invention. The inventor illustrates 4 of each 
entity and deems the representation sufficient for the purpose of explaining 
the present invention. 

In addition to the represented objects, voice application shell 300 
includes a plurality of settings options. In this example, basic settings 
options are tabled for reference and given the element numbers 305a-c, 
illustrating 3 listed settings options. Reading in the table from top to 
bottom, a first setting, launch type (305a), defines an initial entry point for 
voice application 300 into the communications system. As described above 
with reference to Fig. 2 step 204, the choices for launch type 305a are 
inbound or outbound. In an alternative embodiment, a launch type may be 
defined by a third party and be defined in some other pattern than inbound or 
outbound. 

Outbound launch designation binds a voice application to one or 
more addressees (consumers). The addressee may be a single contact or a 
group of contacts represented by the contact list or distribution list also 
described with reference to Fig. 2 above (step 202). When the outbound 
voice application is launched in this case, it is delivered to the addressee 
designated on a voice application outbound contact field (not shown). All 
addressees designated receive a copy of the outbound voice application and 
have equal opportunity to interact (if allowed) with the voice application 
dialog and the corresponding backend data resources if they are used in the 
particular application. 

In the case of an inbound voice application designation for launch 
type 305a, the system instructs the application to assume a ready stand-by 
mode. The application is launched when the designated voice application 
consumer actively makes a request to access the voice application. A typical 
call center IVR system assumes this type of inbound application. 

Launch time setting (305b) is only enabled as an option if the voice 
application launch type setting 305a is set to outbound. The launch time 
setting is set to instruct a novel scheduling engine, which may be assumed to 
be part of the application server function described with reference to Fig. 
1B. The scheduling engine controls the parameter of when to deliver 
the voice application to the designated addressees. The time 
setting may reflect on-demand, scheduled launch, or any third-party-defined 
patterns. 

On-demand gives the developer full control over the launch time of 
the voice application. The on-demand feature also allows any third-party 
system to issue a trigger event to launch the voice application. It is noted 
herein that in the case of third-party control the voice application interaction 
may transcend more than one communications system and/or network. 

Property setting 305c defines essentially how the voice application 
should behave in general. Possible state options for setting 305c are public, 
persistent, or sharable. A public state setting indicates that the voice 
application should be accessible to anyone within the voice portal domain so 
that all consumers with minimum privilege can access the application. A 
persistent state setting for property 305c ensures that only one copy of the 
voice application is ever active regardless of how many consumers are 
attempting to access the application. An example of such a scenario would 
be that of a task-allocation voice application. For example, in a 
task-allocation scenario there are only a limited number of time slots available 
for a user to access the application. If the task is a request from a pool of 
contacts such as perhaps customer-support technicians to lead a scheduled chat 
session, then whenever a time slot has been selected, the other technicians 
can only select the slots that are remaining. Therefore if there is only one 
copy of the voice application circulating within the pool of technicians, the 
application captures the technician's response on a first-come, first-served 
basis. 

A sharable application state setting for property 305c enables the 
consumer to "see" the responses of other technicians in the dialog at issue, 
regardless of whether the voice application is persistent or not. Once the 
voice application shell is created, the producer can then create the first 
dialog of the voice application as described with reference to Fig. 2 step 207. 
It is noted herein that shell 300 is modeled using a remote and preferably 
a desktop client that will be described in more detail later in this 
specification. 
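
As an aid to understanding, container 300 and its settings 305a-c might be modeled as in the following Python sketch. The class name VoiceApplicationShell and its field names are hypothetical, and the sketch covers only the inbound and outbound launch types, not third-party-defined patterns. It enforces the constraint stated above that the launch time setting is enabled only for an outbound launch type.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

ALLOWED_PROPERTIES = {"public", "persistent", "sharable"}


@dataclass
class VoiceApplicationShell:
    """Hypothetical model of container 300 and settings 305a-c."""
    name: str
    launch_type: str = "inbound"                        # setting 305a
    launch_time: Optional[str] = None                   # setting 305b
    properties: Set[str] = field(default_factory=set)   # setting 305c
    dialogs: List[str] = field(default_factory=list)    # dialog objects 301a-n

    def __post_init__(self) -> None:
        if self.launch_type not in ("inbound", "outbound"):
            raise ValueError("launch type must be inbound or outbound")
        # Launch time is only enabled when the launch type is outbound.
        if self.launch_type == "inbound" and self.launch_time is not None:
            raise ValueError("launch time applies only to outbound applications")
        unknown = self.properties - ALLOWED_PROPERTIES
        if unknown:
            raise ValueError(f"unknown property settings: {sorted(unknown)}")
```

Because the property settings are held as a set, an application can be, for example, both persistent and sharable, consistent with the statement that sharing applies regardless of persistence.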

Fig. 4 is a block diagram illustrating a dialog object model 400 
according to an embodiment of the present invention. Dialog object model 
400 is analogous to any of dialog objects 301a-n described with reference to 
Fig. 3 above. Object 400 models a dialog and all of its properties. A 
properties object illustrated within dialog object 400 and labeled Object 
Properties (410) contains the dialog type and properties including behavior 
states and business rules that apply to the dialog. 

For example, every dialog has a route-to property illustrated in the 
example as Route To property (411). Property 411 maps to and identifies 
the source of the dialog. Similarly, every dialog has a route-from property 
illustrated herein as Route From property (412). Route from property 412 
maps to and identifies the recipient contact of the dialog or the dialog 
consumer. 

Every dialog falls under a dialog type illustrated in this example by a 
property labeled Dialog Type and given the element number 413. Dialog 
type 413 may include but is not limited to the following types of dialogs: 

1. Radio Dialog: A radio dialog allows a voice application consumer to 
interactively select one of the available options from an option list after 
hearing the dialog description. 

2. Bulletin Dialog: A bulletin dialog allows a voice application 
consumer to interact with a bulletin board-like forum where multiple 
consumers can share voice messages in an asynchronous manner. 

3. Statement Dialog: A statement dialog plays out a statement to a 
voice application consumer without expecting any responses from 
the consumer. 

4. Open Entry Dialog: An open entry dialog allows a voice application 
consumer to record a message of a pre-defined length after hearing 
the dialog description. 

5. Third Party Dialog: A third party dialog is a modular container 
structure that allows the developer to create a custom-made dialog 
type with its own properties and behaviors. An example would be 
Nuance's SpeechObject™. 
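
The dialog object model and the dialog types listed above might be represented as in the following Python sketch. The names DialogType, Dialog, and expects_response are assumptions made for illustration. Treating every type other than a statement dialog as expecting a response is a simplification, since a third party dialog defines its own behaviors.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DialogType(Enum):
    RADIO = "radio"
    BULLETIN = "bulletin"
    STATEMENT = "statement"
    OPEN_ENTRY = "open entry"
    THIRD_PARTY = "third party"


@dataclass
class Dialog:
    """Hypothetical counterpart of dialog object 400."""
    description: str
    route_to: str              # Route To property 411
    route_from: str            # Route From property 412
    dialog_type: DialogType    # Dialog Type property 413
    options: List[str] = field(default_factory=list)  # used by radio dialogs

    def expects_response(self) -> bool:
        # A statement dialog plays out without expecting a response.
        return self.dialog_type is not DialogType.STATEMENT
```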

Each dialog type has one or more associated business rules tagged to 
it enabling determination of a next step in response to a perceived state. A 
rule compares the application consumer response with an operand defined by 
the application developer using an operational code such as less than, greater 
than, equal to, or not equal to. In a preferred embodiment of the invention 
the parameters surrounding a rule are as follows: 

If user response is equal to the predefined value, then perform one 
of the following: 

A. Do nothing and terminate the dialog state. 

B. Do a live bridge transfer to the contact specified. Or, 

C. Send another dialog to another contact. 
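
A rule of the kind just described, comparing the consumer response with a developer-defined operand under an operational code and then selecting one of actions A through C, can be sketched as follows. The names BusinessRule and next_action, the action strings, and the choice to terminate when no rule matches are assumptions for illustration only.

```python
import operator
from dataclasses import dataclass
from typing import List, Tuple

# Operational codes named in the text, mapped to Python comparisons.
OPERATORS = {
    "less than": operator.lt,
    "greater than": operator.gt,
    "equal to": operator.eq,
    "not equal to": operator.ne,
}


@dataclass
class BusinessRule:
    op_code: str
    operand: str
    action: str        # "terminate", "live_bridge_transfer", or "send_dialog"
    target: str = ""   # contact for a transfer or for the next dialog

    def matches(self, response: str) -> bool:
        # Strings compare lexicographically; numeric operands would need casting.
        return OPERATORS[self.op_code](response, self.operand)


def next_action(rules: List[BusinessRule], response: str) -> Tuple[str, str]:
    """Return (action, target) for the first matching rule."""
    for rule in rules:
        if rule.matches(response):
            return rule.action, rule.target
    return "terminate", ""   # assumed fallback when no rule applies
```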

In the case of an outbound voice application, there are likely to be 
exception-handling business rules associated with perceived states. In a 
preferred embodiment of the present invention, exception handling rules are 
encapsulated into three different events: 

1. An application consumer designated to receive the voice application 
rejects a request for interacting with the voice application. 

2. An application consumer has a busy connection at the time of launch 
of the voice application, for example, a telephone busy signal. And, 

3. An application consumer's connection is answered by or is redirected 
to a non-human device, for example, a telephone answering machine. 

For each of the events above, any one of the three follow-up actions 
is possible according to perceived state: 

1. Do nothing and terminate the dialog state. 

2. Redial the number. 

3. Send another dialog to another contact. 
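
The pairing of the three exception events with the three follow-up actions can be expressed as a simple lookup. In the Python sketch below the enumeration names and the handle_exception function are hypothetical, and falling back to termination for an event with no configured action is an assumption.

```python
from enum import Enum
from typing import Dict


class CallException(Enum):
    REJECTED = "consumer rejected the request"
    BUSY = "busy connection"
    NON_HUMAN = "answered by a non-human device"


class FollowUp(Enum):
    TERMINATE = "do nothing and terminate the dialog state"
    REDIAL = "redial the number"
    SEND_DIALOG = "send another dialog to another contact"


def handle_exception(event: CallException,
                     policy: Dict[CallException, FollowUp]) -> FollowUp:
    """Look up the developer-configured follow-up action for an event."""
    # Events with no configured action terminate the dialog (assumed default).
    return policy.get(event, FollowUp.TERMINATE)
```

A developer-configured policy such as {CallException.BUSY: FollowUp.REDIAL} would then redial on a busy signal and terminate on the other two events.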

Fig. 5 is a process flow diagram illustrating steps for voice dialog 
creation for a VXML-enabled voice application according to an embodiment 
of the present invention. All dialogs can be reused for subsequent dialog 
routing. There is, as previously described, a set of business rules for every 
dialog and contact pair. A dialog can be active and can transition from one 
dialog state to another only when it is rule enabled. 

At step 501 a developer populates a dialog description field with a 
dialog description. A dialog description may also contain reference to XML 
tags as will be described further below. At step 502, parameters of the 
dialog type are entered based on the assigned type of dialog. Examples of 
the available parameters were described with reference to Fig. 4 above. 

At step 503 the developer configures the applicable business rules for 
the dialog type covering, as well, follow-up routines. In one embodiment 
rules configuration at step 503 resolves to step 505 for determining 
follow-up routines based on the applied rules. For example, the developer may 
select, at step 505, one of three types of transfers: the 
developer may configure for a live transfer as illustrated by step 506; transfer 
to a next dialog for creation as illustrated by step 507; or the developer may 
configure for dialog completion as illustrated by step 508. 

If the developer does not branch off into configuring sub-routines 
506, 507, or 508 from step 505, but rather continues from step 503 to step 
504 wherein inbound or outbound designation for the dialog is system 
assigned, then the process must branch from step 504 to either step 508 or 
509, depending on whether the dialog is inbound or outbound. If at step 
504, the dialog is inbound, then at step 508 the dialog is completed. If the 
assignment at step 504 is outbound, then at step 509 the developer 
configures call exception business rules. 

At step 510, the developer configures at least one follow-up action 
for system handling of exceptions. If no follow-up actions are required to be 
specified at step 510, then the process resolves to step 508 for dialog 
completion. If an action or actions are configured at step 510, then at step 
511 the action or actions are executed, such as a system re-dial, which is the 
illustrated action for step 511. 

In a preferred embodiment, once the voice application has been 
created, it can be deployed and accessed through the telephone. The method 
of access, of course, depends on the assignment configured at step 504. 
For example, if the application is inbound, the application consumer accesses 
a voice portal to access the application. As described further above, a voice 
portal is a voice interface for accessing a selected number of functions of the 
voice application server described with reference to Fig. 1B above. A voice 
portal may be a connection-oriented-switched-telephony (COST) enabled 
portal or a data-network-telephony (DNT) enabled portal. In the case of 
an outbound designation at step 504, the application consumer receives the 
voice application through an incoming call to the consumer originated from 
the voice application server. In a preferred embodiment, the outbound call 
can be either COST based or DNT based depending on the communications 
environment supported. 

Fig. 6 is a block diagram illustrating a dialog transition flow after 
initial connection with a consumer according to an embodiment of the 
present invention. Some of the elements illustrated in this example were 
previously introduced with respect to the example of Fig. 1B above and 
therefore shall retain their original element numbers. In this example, an 
application consumer is logically illustrated as Application Consumer 600 
that is actively engaged in interaction with a dialog 601 hosted by telephony 
server 130. Server 130 is, as previously described, a VXML-compliant 
telephony server and is so labeled. 

Application server 110 is also actively engaged in the interaction 
sequence and has the capability to provide dynamic content to consumer 
600. As application consumer 600 begins to interact with the voice 
application represented herein by dialog 601 within telephony server 130, 
voice application server 110 monitors the situation. In actual practice, each 
dialog processed and sent to server 130 for delivery to or access by 
consumer 600 is an atomic unit of the particular voice application being 
deployed and executed. Therefore dialog 601 may logically represent more 
than one single dialog. 

In this example, assuming more than one dialog, dialog 601 is 
responsible during interaction for acquiring a response from consumer 600. 
Arrows labeled Send and Respond represent the described interaction. 
When consumer 600 responds to dialog content, the response is sent back 
along the same original path to VXML rendering engine 111, which 
interprets the response and forwards the interpreted version to a provided 
dialog controller 604. Controller 604 is part of application logic 112 in 
server 110 described with reference to Fig. 1B. Dialog controller 604 is a 
module that has the ability to perform table lookups, data retrieval, and data 
write functions based on established rules and configured response 
parameters. 

When dialog controller 604 receives a dialog response, it stores the 
response corresponding to the dialog at issue (601) to a provided data 
source 602 for data mining operations and workflow monitoring. Controller 
604 then issues a request to a provided rules engine 603 to look up the 
business rule or rules that correspond to the stored response. Once the 
correct business rule has been located for the response, the dialog controller 
starts interpretation. If the business rule accessed requires reference to a 
third-party data source (not shown), controller 604 makes the necessary data 
fetch from the source. Any data returned by controller 604 is integrated into 
the dialog context and passed onward to VXML rendering engine 111 for 
dialog page generation of a next dialog 601. The process repeats until 
dialog 601 terminates. 

In one embodiment, the business rule accessed by controller 604 as a 
result of a received response from consumer 600 carries a dialog transition 
state other than back to the current application consumer. In this case 
controller 604 spawns an outbound call from application server 110 to 
deliver the next or "generated dialog" to the designated target application 
consumer. At the same time, the current consumer has his/her dialog state 
completed as described with reference to Fig. 5 step 508 according to 
predefined logic specified in the business rule. 
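
The store, look up, and generate cycle performed by dialog controller 604 can be pictured as a loop. The following Python is a minimal sketch under stated assumptions: the function run_dialog_sequence, the rule table keyed by (dialog, response), and the get_response callback are all hypothetical stand-ins for the rules engine, data source, and telephony path, and third-party data fetches and outbound call spawning are omitted.

```python
from typing import Callable, Dict, List, Optional, Tuple


def run_dialog_sequence(
    first_dialog: str,
    rules: Dict[Tuple[str, str], Optional[str]],
    get_response: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Drive dialogs until no rule names a next dialog.

    rules stands in for rules engine 603; the returned list stands in
    for the responses written to data source 602.
    """
    data_source: List[Tuple[str, str]] = []
    current: Optional[str] = first_dialog
    while current is not None:
        response = get_response(current)           # consumer answers the dialog
        data_source.append((current, response))    # store before rule lookup
        current = rules.get((current, response))   # None ends the sequence
    return data_source
```

The sketch stores each response before consulting the rule table, mirroring the order given in the text.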

It will be apparent to one with skill in the art that a dialog can 
contain dynamic content by enabling controller 604 to have access to data 
source 602 according to rules served by rules engine 603. In most 
embodiments there are generally two types of dynamic content. Both types 
are, in preferred embodiments, structured in the form of XML and are 
embedded directly into the next generated dialog page. The first of the two 
types of dynamic content is classified as non-recurring. Non-recurring 
content makes a relative reference to a non-recurring resource label in a 
resource adapter registry within a resource adapter analogous to adapter 113 
of voice application server 110 described with reference to Fig. 1B. 

In the above case, when dialog controller 604 interprets the dialog, it 
first scans for any resource label. If a match is found, it looks up the 
resource adapter registry and invokes the corresponding resource API to 
fetch the required data into the new dialog context. Once the raw data is 
returned from the third-party data source, it passes the raw data to a 
corresponding resource filter for further processing. When processing by 
the filter is complete, the dialog resource label or tag is replaced 
with the filtered data and is integrated transparently into the new dialog. 
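
The scan, registry lookup, fetch, filter, and replace sequence for non-recurring content might look like the following Python sketch. It assumes self-closing labels of the form <resource type='...' name='...'/> as used in the examples of this specification; the function resolve_resources, the registry and filter dictionaries, and the label names in the usage note are hypothetical.

```python
import re
from typing import Callable, Dict, Tuple

# Matches self-closing labels such as <resource type='SYSTEM' name='x'/>.
RESOURCE_TAG = re.compile(r"<resource\s+type='(\w+)'\s+name='(\w+)'\s*/>")

Key = Tuple[str, str]


def resolve_resources(
    description: str,
    registry: Dict[Key, Callable[[], str]],
    filters: Dict[Key, Callable[[str], str]],
) -> str:
    """Replace each registered resource label with its filtered data."""

    def substitute(match: "re.Match[str]") -> str:
        key = (match.group(1), match.group(2))
        if key not in registry:
            return match.group(0)       # leave unregistered labels untouched
        raw = registry[key]()           # invoke the registered resource API
        resource_filter = filters.get(key, lambda value: value)
        return resource_filter(raw)     # filtered data replaces the label

    return RESOURCE_TAG.sub(substitute, description)
```

For instance, with a registry entry returning "good morning" for an ADAPTER label named time_greeting and a capitalizing filter, the label is replaced by "Good morning" in the generated dialog text.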

The second type of dynamic content is recurring. Recurring content 
usually returns more than one name and value pair. An example 
would be a list of stocks in an application consumer's stock portfolio. For 
example, a dialog that enables consumer 600 to parrot a specific stock and 
have the subsequent quote returned through another dialog state is made to 
use recurring dynamic content to achieve the desired result. Recurring 
content makes a relative reference to a recurring resource label in the 
resource adapter registry of voice application server 110. When controller 
604 interprets the dialog, it handles the resource in an identical manner to 
handling of non-recurring content. However, instead of simply returning the 
filtered data back to the dialog context, it loops through the data list and 
configures each listed item as a grammar-enabled keyword. In so doing, 
consumer 600 can parrot one of the items (separate stocks) in the list played 
in the first dialog and have the response captured and processed for return in 
the next dialog state. The stock-quote example presented below illustrates 
possible dialog/response interactions from the viewpoint of consumer 600. 

Voice Application: "Good morning Leo, what stock quote do you want?" 

Application Consumer: "Oracle" 

Voice Application: "Oracle is at seventeen dollars." 

Voice Application: "Good morning Leo, what stock quote do you want?" 

This particular example consists of two dialogs. 

The first dialog plays out the statement "Good morning Leo, what 
stock quote do you want?" The dialog is followed by a waiting state that 
listens for keywords such as Oracle, Sun, Microsoft, etc. The statement 
contains two dynamic non-recurring resource labels. The first one is the 
time of day: Good morning, good afternoon, or good evening. The second 
dynamic content is the name of the application consumer. In this case, the 
name of the consumer is internal to the voice application server, thus the 
type of the resource label is SYSTEM. In the actual dialog description field, 
it may look something like this: 

<resource type='ADAPTER' name='time_greeting'/> <resource 
type='SYSTEM' name='target_contact'/>, what stock quote do you 
want? 

Because the dialog is expecting the consumer to say a stock out of 
his/her existing portfolio, the dialog type is radio dialog, and the expected 
response property of the radio dialog is 

<resource type='ADAPTER' name='stock_list'> 
    <param> 
        <resource type='SYSTEM' name='target_contact_id'/> 
    </param> 
</resource> 

This XML resource label tells dialog controller 604 to look for a 
resource label named stock_list and to invoke the corresponding API with 
target_contact_id as the parameter. Upon completion of the data fetching, 
the list of stocks is integrated into the dialog as part of the grammars. 
Whatever the user responds in terms of stock identification is matched 
against the grammars at issue (stocks in portfolio), and the grammar 
return value is assigned to the dialog response, which can then be forwarded 
to the next dialog as a resource of DIALOG type. 
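
The handling of recurring content, in which each listed item becomes a grammar-enabled keyword and the consumer's utterance is matched to a grammar return value, can be sketched as follows. The functions build_grammar and match_response are hypothetical, the case-insensitive matching is an assumption, and the return values in the usage note are made-up placeholders rather than real identifiers.

```python
from typing import Dict, List, Optional


def build_grammar(items: List[Dict[str, str]]) -> Dict[str, str]:
    """Map each keyword (lower-cased) to its grammar return value.

    Each item is one name and value pair returned by the resource API.
    """
    return {item["name"].lower(): item["value"] for item in items}


def match_response(grammar: Dict[str, str], utterance: str) -> Optional[str]:
    """Return the grammar return value for an utterance, or None if no match."""
    return grammar.get(utterance.strip().lower())
```

Given a portfolio list containing the names Oracle and Sun with placeholder values stock_001 and stock_002, the utterance "Oracle" yields stock_001, which the next dialog could then reference as a DIALOG resource.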

The producer can make reference to any dialog return values in any 
subsequent dialog by using <resource type='DIALOG' 
name='dialog_name'/>. This rule enables the producer to play out the 
options the application consumer selected previously in any follow-up 
dialogs. 

The second dialog illustrated above plays out the quote of the stock 
selected from the first dialog, then returns the flow back to the first dialog. 
Because no extra branching logic is involved in this dialog, the dialog type in 
this case is a statement dialog. The dialog's follow-up action is simply to 
forward the flow back to the first dialog. In such a case, the dialog 
statement is: 

<resource type='DIALOG' name='select_stock_dialog'/> 
<resource type='ADAPTER' name='get_stock_quote'> 
    <param> 
        <resource type='DIALOG' name='select_stock_dialog'/> 
    </param> 
</resource> 

Besides making reference to ADAPTER, DIALOG and SYSTEM 
type, the dialog can also take in other resource types such as SOUND and 
SCRIPT. SOUND can be used to impersonate the dialog description by 
inserting a sound clip into the dialog description. For example, to play a 
sound after the stock quote, the producer inserts <resource type='SOUND' 
name='beep'/> right after the ADAPTER resource tag. 

The producer can add a custom-made VXML script into the dialog 
description by using <resource type='RESOURCE' name='confirm'/> so 
that in the preferred embodiment, any VXML can be integrated into the 
dialog context transparently with maximum flexibility and expandability. 

It will be apparent to one with skill in the art that while the examples 
cited herein use VXML and XML as the mark-up languages and tags, 
other suitable markup languages can be utilized in place of 
or integrated with the mentioned conventions without departing from the 
spirit and scope of the invention. It will also be apparent to the skilled 
artisan that while the initial description of the invention is made in terms of a 
voice application server having interface to a telephony server using 
generally HTTP requests and responses, the present 
invention can be practiced in any system that is capable of handling 
well-defined requests and responses across any distributed network. 

Figs. 7-15 illustrate various displayed Browser frames of a 
developer platform interface analogous to CL 141 of station 140 of Fig. 1B. 
Description of the following interface frames and frame contents assumes 
existence of a desktop computer host analogous to station 140 of Fig. 1B 
wherein interaction is enabled in HTTP request/response format as would be 
the case of developing over the Internet network for example. However, the 
following description should not limit the method and apparatus of the 
invention in any way as differing protocols, networks, interface designs and 
scope of operation can vary. 

Fig. 7 is a plan view of a developer's frame 700 containing a developer's 
login screen 710 according to an embodiment of the present invention. 
Frame 700 is presented to a developer in the form of a Web browser 
container according to one embodiment of the invention. Commercial Web 
browsers are well known and any suitable Web browser will support the 
platform. Frame 700 has all of the traditional Web options associated with 
most Web browser frames including back, forward, Go, File, Edit, View, and 
so on. A navigation tool bar is visible in this example. Screen 710 is a login 
page. The developer may, in one embodiment, have a developer's account. 
In another case, more than one developer may share a single account. There 
are many possibilities. 

Screen 710 has a field for inserting a login ID and a field for inserting 
a login personal identification number (PIN). Once login parameters are 
entered the developer submits the data by clicking on a button labeled Login. 
Screen 710 may be adapted for display on a desktop computer or any one of 
a number of other network capable devices following specified formats for 
display used on those particular devices. 

Fig. 8 is a plan view of a developer's frame 800 containing a screen 
shot of a home page of the developer's platform interface of Fig. 7. Frame 
800 contains a sectioned screen comprising a welcome section 801, a 
product identification section 802 and a navigation section 803 combined to 
fill the total screen or display area. A commercial name for a voice 
application developer's platform that is coined by the inventor is the name 
Fonelet. Navigation section 803 is provided to display on the "home page" 
and on subsequent frames of the software tool. 

Navigation section 803 contains, reading from top to bottom, a 
plurality of useful links, starting with a link to home followed by a link to an 
address book. A link for creating a new Fonelet (voice application) is 
labeled Create New. A link to "My" Fonelets is provided as well as a link to 
"Options". A standard Help link is illustrated along with a link to Logout. 
An additional "Options Menu" is the last illustrated link in section 803. 
Section 803 may have additional links that are visible by scrolling down with 
the provided scroll bar traditional to the type of display of this example. 

Fig. 9 is a plan view of a developer's frame 900 containing a screen 
shot of an address book 911 accessible through interaction with the option 
Address in section 803 of the previous frame of Fig. 8. Screen 911 has an 
interactive option for listing individual contacts and for listing contact lists. 
A contact list is a list of voice application consumers and a single contact 
represents one consumer in this example. However, in other embodiments a 
single contact may mean more than one entity. Navigation screen 803 is 
displayed on the left of screen 911. In this example, contacts are listed by 
First Name followed by Last Name, followed by a telephone number and an 
e-mail address. Other contact parameters may also be included or excluded 
without departing from the spirit and scope of the invention. For example 
the Web site of a contact may be listed and may also be the interface for 
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receiving a voice application. To the left of the listed contacts are interactive 
selection boxes used for selection and configuration purposes. Interactive 
options are displayed in the form of Web buttons and adapted to enable a 
developer to add or delete contacts. 

Fig. 10 is a plan view of a developer's frame 1000 displaying a screen 
1001 for creating a new voice application. Screen 1001 initiates creation of 
a new voice application termed a Fonelet by the inventor. A name field 1002 
is provided in screen 1001 for inputting a name for the application. A 
description field 1003 is provided for the purpose of entering the
application's description. A property section 1004 is illustrated and adapted
to enable a developer to select from available options listed as Public, 
Persistent, and Shareable by clicking on the appropriate check boxes. 

A Dialog Flow Setup section is provided and contains a dialog type
selection field 1005 and a subsequent field for selecting a contact or contact
group 1006. After the required information is correctly populated into the 
appropriate fields, a developer may "create" the dialog by clicking on an 
interactive option 1007 labeled Create. 

Fig. 11 is a plan view of a developer's frame 1100 illustrating screen
1001 of Fig. 10 showing further options as a result of scrolling down. A
calling schedule configuration section 1101 is illustrated and provides the
interactive options of On Demand or Scheduled. As was previously
described, selecting On Demand enables application deployment at the will
of the developer while selecting Scheduled initiates configuration for a
scheduled deployment according to time/date parameters. A grouping of
entry fields 1102 is provided for configuring Time Zone and Month of
launch. A subsequent grouping of entry fields 1103 is provided for
configuring the Day of Week and the Day of Month for the scheduled
launch. A subsequent grouping of entry fields 1104 is provided for
configuring the hour and minute of the scheduled launch. It is noted herein
that the options enable a repetitive launch of the same application. Once the
developer finishes specifying the voice application shell, he or she can click
a Create Dialog button labeled Create to spawn an overlying browser
window for dialog creation.
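
By way of illustration only, the calling schedule options described above might be modeled as in the following minimal sketch. The class name CallingSchedule, its field names, and the is_due method are hypothetical and are not part of the described interface. The sketch assumes that an unset Month, Day of Week, or Day of Month matches any value, which is one way to obtain the repetitive launch noted above; Time Zone handling is omitted.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CallingSchedule:
    """Hypothetical model of calling schedule section 1101 and fields 1102-1104."""
    mode: str = "on_demand"             # "on_demand" or "scheduled"
    time_zone: str = "UTC"              # stored only; not applied in this sketch
    month: Optional[int] = None         # 1-12, None matches every month
    day_of_week: Optional[int] = None   # 0 = Monday .. 6 = Sunday, None matches any
    day_of_month: Optional[int] = None  # 1-31, None matches any
    hour: int = 0
    minute: int = 0

    def is_due(self, now: datetime) -> bool:
        """Return True when a scheduled launch should fire at the given minute."""
        if self.mode != "scheduled":
            return False  # on-demand applications launch only at the developer's will
        return all([
            self.month in (None, now.month),
            self.day_of_week in (None, now.weekday()),
            self.day_of_month in (None, now.day),
            self.hour == now.hour,
            self.minute == now.minute,
        ])
```

In this sketch a schedule with only Day of Week, hour, and minute set fires once every week, while a schedule left in the default on-demand mode never fires automatically.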

Fig. 12 is a screen shot of a dialog configuration window 1200 
illustrating a dialog configuration page according to an embodiment of the 
invention. In this window a developer configures the first dialog that the 
voice application or Fonelet will link to. A dialog identification section 1201
is provided for the purpose of identifying and describing the dialog to be
created. A text entry field for entering a dialog name and a text entry field
for entering dialog description are provided. Within the dialog description
field, an XML resource tag (not shown) is inserted which, for example, may
refer to a resource label machine code registered with a resource adapter
within the application server analogous to adapter 113 and application server
110 described with reference to Fig. 1B.

A section 1202 is provided within screen 1200 and adapted to enable
a developer to configure for expected responses. In this case the type of
dialog is a Radio Dialog. Section 1202 serves as the business rule logic
control for multiple choice-like dialogs. Section 1202 contains a selection
option for Response of Yes or No. It is noted herein that there may be more
and different expected responses in addition to a simple yes or no response.

An adjacent section is provided within section 1202 for configuring
any Follow-Up Action to occur as the result of an actual response to the
dialog. For example, an option of selecting No Action is provided for each
expected response of Yes and No. In the case of a follow-up action, an
option for Connect is provided for each expected response. Adjacent to
each illustrated Connect option, a Select field is provided for selecting a
follow-up action, which may include fetching data.

A Send option is provided for enabling Send of the selected follow-up
action including any embedded data. A follow-up action may be any type
of configured response such as send a new radio dialog, send a machine
repair request, and so on. A send to option and an associated select option
are provided for identifying a recipient of a follow-up action and enabling
automated send of the action to the recipient. For example, if a first dialog is
a request for machine repair service sent to a plurality of internal repair
technicians, then a follow-up might be to send the same dialog to the next
available contact in the event the first contact refused to accept the job or
was not available at the time of deployment.

In the above case, the dialog may propagate from contact to contact
down a list until one of the contacts is available and chooses to interact with
the dialog by accepting the job. A follow-up in this case may be to send a
new dialog to the accepting contact detailing the parameters of which
machine to repair including the diagnostic data of the problem and when the
repair should take place. In this example, an option for showing details is
provided for developer review purposes. Also interactive options for creating
new or additional responses and for deleting existing responses from the
system are provided. It is noted herein that once a dialog and dialog
responses are created then they are reusable over the whole of the voice
application and in any specified sequence in a voice application.

A section 1203 is provided within screen 1200 and adapted for
handling Route-To Connection Exceptions. This section enables a developer
to configure what to do in case of possible connection states experienced in
application deployment. For example, for a Caller Reject, Line Busy, or
connection to Voice Mail there are options for No Action and for Redial
illustrated. It is noted herein that there may be more Exceptions as well as
Follow-up action types than are illustrated in this example without departing
from the spirit and scope of the present invention.

A Send option is provided for each type of exception for re-sending
the same or any other dialog that may be selected from an adjacent drop
down menu. For example, if the first dialog is a request for repair services
and all of the initial contacts are busy, the dialog may be sent
back around to all of the contacts until one becomes available by first
moving to a next contact for send after each busy signal and then beginning
at the top of the list again on re-dial. In this case John Doe represents a next
recipient after a previous contact rejects the dialog, is busy, or re-directs to
voice mail because of unavailability. Section 1203 is only enabled when the
voice application is set to outbound. Once the first dialog is created and
enabled by the developer then a second dialog may be created if desired by
clicking on one of the available buttons labeled detail. Also provided are
interactive buttons for Save Dialog, Save and Close, and Undo Changes.
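
The send-around behavior described above can be sketched as follows. This is an illustrative model only: the function route_dialog, the outcome constants, and the max_rounds limit are hypothetical names introduced here, and the attempt callback stands in for an actual outbound call. The sketch assumes that a contact who answers and declines is not called again, while a contact whose call ends in a connection exception (reject, busy, voice mail) is redialed on the next pass through the list.

```python
from typing import Callable, List, Optional

# Illustrative outcome labels; the actual system's states may differ.
ACCEPT, DECLINE = "accept", "decline"
BUSY, REJECT, VOICE_MAIL = "busy", "reject", "voice_mail"
CONNECTION_EXCEPTIONS = {BUSY, REJECT, VOICE_MAIL}


def route_dialog(contacts: List[str],
                 attempt: Callable[[str], str],
                 max_rounds: int = 2) -> Optional[str]:
    """Send one dialog down the contact list until a contact accepts.

    Returns the accepting contact, or None if nobody accepts within
    max_rounds passes through the list.
    """
    pending = list(contacts)
    for _ in range(max_rounds):
        redial = []
        for contact in pending:
            outcome = attempt(contact)
            if outcome == ACCEPT:
                return contact
            if outcome in CONNECTION_EXCEPTIONS:
                redial.append(contact)  # Redial: try again on the next pass
            # DECLINE: the contact answered "No", so move to the next contact
        if not redial:
            return None  # everyone answered and declined
        pending = redial  # begin at the top of the remaining list again
    return None
```

Bounding the number of passes is a design choice of this sketch; it simply prevents endless redialing when every line stays busy.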

Fig. 13 is a screen shot 1300 of dialog design panel 1200 of Fig. 12
illustrating progression of dialog state to a subsequent contact. The dialog
state configured in the example of Fig. 12 is now transmitted from a contact
listed in Route From to a contact listed in Route To in section 1301, which is
analogous to section 1201 of Fig. 12. In this case, the contacts involved are
John Doe and Jane Doe. In this case, the dialog name and description are
the same because the dialog is being re-used. The developer does not have
to re-enter any of the dialog context. However, because each dialog has a
unique relationship with a recipient the developer must configure the
corresponding business rules.

Sections 1302 and 1303 of this example are analogous to sections 
1202 and 1203 of the previous example of Fig. 12. In this case if John Doe 
says no to the request for machine repair then the system carries out a bridge
transfer to Jane Doe. In the case of exceptions, shown in Route-To
Connection Exceptions region 1303, all the events are directed to a
redialing routine. In addition to inserting keywords such as "Yes" or "No"
in the response field 1302, the developer can create a custom thesaurus by
clicking on a provided thesaurus icon not shown in this example. All the
created vocabulary in a thesaurus can later be re-used throughout any voice
applications the developer creates.

Fig. 14 is a screen shot of a thesaurus configuration window 1400
activated from the example of Fig. 13 according to a preferred embodiment.
Thesaurus window 1400 has a section 1401 containing a field for labeling a
vocabulary word and an associated field for listing synonyms for the labeled
word. In this example, the word no is associated with probable responses
no, nope, and the phrase "I can not make it". In this way voice recognition
regimens can be trained in a personalized fashion to accommodate for
varieties in a response that might carry a same meaning.
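
As a minimal sketch of the thesaurus idea, a labeled vocabulary word and its synonyms can be held in a lookup table that resolves a recognized utterance back to the labeled word. The class Thesaurus and its methods add and resolve are hypothetical names, and the normalization step (lower-casing and stripping punctuation) is an assumption of this sketch rather than a described feature.

```python
import re
from typing import Dict, Iterable, Optional


class Thesaurus:
    """Hypothetical lookup from spoken variants to a labeled vocabulary word."""

    def __init__(self) -> None:
        self._labels: Dict[str, str] = {}

    @staticmethod
    def _normalize(text: str) -> str:
        # Lower-case, drop punctuation, and collapse runs of whitespace.
        cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
        return re.sub(r"\s+", " ", cleaned).strip()

    def add(self, word: str, synonyms: Iterable[str]) -> None:
        """Register a labeled word together with its synonyms."""
        for phrase in [word, *synonyms]:
            self._labels[self._normalize(phrase)] = word

    def resolve(self, utterance: str) -> Optional[str]:
        """Return the labeled word for an utterance, or None if unknown."""
        return self._labels.get(self._normalize(utterance))
```

With the entry of this example registered through add("no", ["nope", "I can not make it"]), the responses "Nope!" and "I can not make it." both resolve to the labeled word no, so a single business rule for No covers all three variants.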

A vocabulary section 1402 is provided and adapted to list all of the 
created vocabulary words for a voice application and a selection mechanism 
(a selection bar in this case) for selecting one of the listed words. An option 
for creating a new word and synonym pair is also provided within section
1402. A control panel section 1403 is provided within window 1400 and 
adapted with the controls Select From Thesaurus; Update Thesaurus; Delete 
From Thesaurus; and Exit Thesaurus. 

Fig. 15 is a plan view of a developer's frame 1500 illustrating a
screen 1502 for managing created modules according to an embodiment of
the present invention.

After closing all dialog windows frame 1500 displays screen or page
1502 for module management options. Menu section 803 is again visible.
Screen 1502 displays as a result of clicking on the option "My" or My
Fonelet in section 803. Screen 1502 lists all voice applications that are
already created and usable. In the list, each voice application has a check
box adjacent thereto, which can be selected to change state of the particular
application. A column labeled Status is provided within screen 1502 and
located adjacent to the list of applications already created.

The Status column lists the changeable state of each voice
application. Available status options include but are not limited to listed
states of Inactive, Activated and Inbound. A column labeled Direct Access
ID is provided adjacent to the Status column and is adapted to enable the
developer to access a voice application directly through a voice interface in a
PSTN network or in one embodiment from a DNT voice interface. In a
PSTN embodiment, direct access ID capability serves as an extension of a
central phone number. A next column labeled Action is provided adjacent to
the direct access ID column and is adapted to enable a developer to select
and apply a specific action regarding state of a voice application.

For example, assume that a developer has just finished the voice
application identified as Field Support Center (FSC) listed at the top of the
application identification list. Currently, the listed state of FSC is Inactive.
The developer now activates the associated Action drop down menu and
selects Activate to launch the application FSC on demand. In the case of a
scheduled launch, the voice application is activated automatically according
to the settings defined in the voice application shell.

As soon as the Activate command has been issued, the on-demand
request is queued for dispatching through the system's outbound application
server. For example, John Doe then receives a call originating from the
voice application server (110) that asks if John wants to take the call. If
John responds "Yes," the voice application is executed. The actual call flow
follows:

System: "Hello John, you received a fonelet from Jim Doe, would
you like to take this call?"
John: "Yes."

System: "Machine number 008 is broken, are you available to fix it?" 
John: "No." 

System: "Thanks for using fonelet. Goodbye!" 
System: Terminate the connection with John, record the call flow to the 
data source, and spawn a new call to Jane Doe. 

System: "Hello Jane, you received a fonelet from Jim Doe, would 
you like to take this call?"
Jane: "Yes." 

System: "Machine number 008 is broken, are you available to fix it?" 
Jane: "I cannot make it." 

System: "Please wait while fonelet transfers you to Jeff Doe." 
System: Carry out the bridge transfer between Jane Doe and Jeff Doe. 
When the conversation is completed, terminate the connection with Jeff 
and record the call flow to the data source. 

The default textual content of the voice application is generated by
the text-to-speech engine hosted on the telephony or DNT
server. However, the voice application producer can access the voice portal 
through the PSTN or DNT server and record his/her voice over any existing 
prompts in the voice application. 

It will be apparent to one with skill in the art that the method and
apparatus of the present invention may be practiced in conjunction with a
CTI-enabled telephony environment wherein developer access for
application development is enabled through a client application running on a
computerized station connected to a data network also having connectivity
to the server spawning the application and telephony components. The
method and apparatus of the invention may also be practiced in a system
that is DNT-based wherein the telephony server and application server are
both connected to a data network such as the well-known Internet network.
There are applications for all mixes of communications environments
including any suitable multi-tier system enabled for VXML and/or other
applicable mark-up languages that may serve similar purpose.

It will also be apparent to one with skill in the art that modeling voice
applications including individual dialogs and responses enables any developer
to create a limitless variety of voice applications quickly by reusing existing
objects in modular fashion thereby enabling a wide range of useful
applications from an existing store of objects.



Auto-Harvesting Web Data

In one embodiment of the present invention one or more Websites 
can be automatically harvested for data to be rendered by a VXML engine 
for generating a voice response accessible by users operating through a 
PSTN-based portal. Such an enhancement is described immediately below. 

Fig. 16 is a block diagram illustrating the dialog transition flow of 
Fig. 6 enhanced for Web harvesting according to an embodiment of the 
present invention. Dialog controller 604 is enhanced in this embodiment to 
access and harvest data from an HTML, WML, or other data source such as 
would be the case of data hosted on a Website. An example scenario for this 
embodiment is that of a banking institution allowing all of its customers to 
access their Web site through a voice portal.

A Website 1600 is illustrated in this embodiment and is accessible to
dialog controller 604 via a network access line 1601 illustrated herein as two 
directional lines of communication. The first line is labeled 
Store/Fetch/Input leading from controller 604 into site 1600. The second 
(return) line is labeled Data Return/Source Field. The separately illustrated 
communication lines are intended to be analogous to a bi-directional Internet 
or other network access line. An internal data source (602) previously 
described with reference to Fig. 6 above is replaced in Fig. 16 by Website 
1600 for explanatory purpose only. It should be noted that multiple data 
sources both internal to server 110 and external to server 110 could be
simultaneously accessible to dialog controller 604. 

Website 1600 provides at least one electronic information page (Web 
page) that is formatted according to the existing rules for the mark-up 
language that is used for its creation and maintenance. Site 1600 may be one 
site hosting many information pages, some of which are inter-related and 
accessible through subsequent navigation actions. Controller 604 in this
embodiment is enhanced for Website navigation at the direction of a user's
voice inputs enabled by rules accessible through rule engine 603. A data
template (not shown) is provided for use by dialog controller 604 to 
facilitate logical data population from site 1600. Dialog controller 604 
analyzes both Website source codes and data fields as return data and uses 
the information to generate a VXML page for rendering engine 111. 

It is noted herein that all of the security and access mechanisms used 
at the site for normal Internet access are conferred upon the customer so that
the customer may be granted access by providing a voice rendering
(response) containing the security access information. This enables the
customer to keep the same security password and/or personal identification
number (PIN) for voice transactions through a portal as well as for normal
Web access to site 1600 from a network-connected computer. 

Fig. 17 is a block diagram of the voice application distribution 
environment of Fig. 1B illustrating added components for automated Web
harvesting and data rendering according to an embodiment of the present 
invention. In this example, workstation 140 running client software 141 has 
direct access to a network server 1701 hosting the target Website 1600. 
Access is provided by way of an Internet access line 1704. 

It is noted herein that there may be many servers 1701 as well as 
many hosted Websites of one or more pages in this embodiment without 
departing from the spirit and scope of the present invention. A database 
store 1702 is provided in this example and illustrated as connected to server 
1701 for the purpose of storing data. Data store 1702 may be an optical 
storage, magnetic storage, a hard disk, or other forms suitable for storing 
data accessible online. In one embodiment, data store 1702 is a relational 
database management system (RDBMS) wherein a single access may involve 
one or more connected sub servers also storing data for access. 

The configuration of client application 141, workstation 140, server
1701, Website 1600, and database 1702 connected by network 1704 enables
Websites analogous to site 1600 to be culled or harvested. Application 141
can read and retrieve all of the default responses that exist for each HTML
script or scripts of another mark-up language. These default responses are
embedded into application logic 112 and VXML rendering engine 111.
Once the content of a Web page has been culled and used in client 141 to
create the rendering, then VXML engine 111 can access the Website
successfully in combination with application logic 112 and database/resource
adaptor 113 by way of a separate access network 1703. For example, if a
user (not shown) accesses Website 1600 through voice portal 143 from
receiving device 135 (telephone), then he or she would be voice prompted
for a password to gain access to the site. Subsequently, a voice rendering of
the data on the site accessed would be recited to him or her over telephone
135.

Generally speaking, the development process for a voice portal
would be the same as was described above with reference to Figs. 9-15.
Some additional scripting or input of dialog is performed using client
application 141. Rather than requiring that the application developer
populate all of the fields from scratch, or re-apply previously entered
options, fields used by the business logic as discussed earlier in Figs. 9
through 15 may be created from information harvested from site 1600 in this
case. For that purpose, a software adapter (not shown) is added to client
software 141 that allows it to communicate with Web site 1600 and harvest
the information, both from the source code comprising fields, labels, etc.
and from data parameters and data variables.
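
The harvesting of fields and labels from page source can be illustrated with a short sketch built on the HTML parser of the Python standard library. The class FieldHarvester and its attributes are hypothetical and do not represent the software adapter of the invention; the sketch only assumes that input fields and hyperlinks are among the elements worth collecting, and it parses a page already held as a string rather than fetching one from a site.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class FieldHarvester(HTMLParser):
    """Collects input fields and link labels from HTML source (illustrative)."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: List[Tuple[str, str]] = []  # (input type, input name)
        self.links: List[Tuple[str, str]] = []   # (href, visible label)
        self._href: Optional[str] = None         # href of the link being read
        self._text: List[str] = []               # label text gathered so far

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input":
            self.fields.append((attributes.get("type", "text"),
                                attributes.get("name", "")))
        elif tag == "a" and attributes.get("href"):
            self._href = attributes["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None
```

Feeding a login page to such a harvester would yield, for example, a text field and a password field that could pre-populate a login dialog, together with link labels such as Accounts or Quotes that could become candidate dialog options.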

It is noted herein that the process for data access, retrieval and voice
rendering is essentially the same as the processes of Figs. 2-5
above except that a Website connection would be established before any 
other options are selected. 

In one embodiment, provision of connection 1703 between server 
110 and server 1701 enables the security environment practiced between
communicating machines such as secure socket layer (SSL), firewall, etc. to
be applied in the created voice solution for a customer. On the analog side, 
the security is no different than that of a call-in line allowing banking services 
in terms of wiretap possibilities etc. 

It will be apparent to one with skill in the art that the method and 
apparatus of the invention can be practiced in conjunction with the Internet, 
an Ethernet, or any other suitable networks. Markup languages supported
include HTML, SHTML, WML, VHTML, XML, and so on. In one 
embodiment, the Websites accessed may be accessed automatically wherein 
the password information for a user is kept at the site itself. There are 
many possible scenarios. 



Prioritizing Web Data for Voice Rendering

According to one aspect of the present invention a method is 
provided for selecting and prioritizing which Web data offerings from a
harvested Web site will be filled into a template for a voice application. 

Fig. 18 is a block diagram illustrating a simple hierarchical structure 
tree of a Web site 1801 and a harvested version of the site 1810. Screen 
1801 illustrates a simple Web site structure tree as might be viewed from a 
user interface. Selectable icons representing data elements are represented
herein as solid lines 1802a through 1802n suggesting that there may be any
number of icons provided within any exemplary Web site. For the purpose
of this specification, icons 1802a-1802n represent selectable icons, logos,
hyperlinks and so on. Classifications of each object 1802a-1802n are
illustrated herein as text labels 1803a through 1803n. For example, a
selectable icon 1802a is one for navigating to the "home page" of the site as
revealed by adjacent classification 1803a. A subsequent icon (1802b) is a
login page of the site as revealed by the classification login. In some cases,
icons and classifications or labels may be one and the same (visibly not
different).

In this example, the hierarchical structure presents a login block, 
which the user must successfully navigate before other options are 
presented. The presented options Accounts, Status, History, Look-up, 
Trade, and Quotes are arranged in a hierarchical structure. For example, one
must access Accounts first before options for Status (Accounts/Status) or
History (Accounts/Status/History) are available to the user. This standard
structure may be inconvenient and uneconomical for template filling for the
purpose of creating a voice application template for dialog navigation. One
reason is that the voice application will be created with an attempt to use all
of the data of the Web site, which likely will include graphics, charts and the
like that would not be understood by an accessing user if the description is
simply translated and recited as a voice dialog over the telephone. Another
reason is that the generic hierarchy of Web site structure 1801 may not be of
a desired hierarchy for rendering as voice dialog in a request/response
format. Typically then, certain data will be valuable, certain data will not be
valuable, and the order in which data is presented at the dialog level will be
important to the user as well as to the administrator (service provider).

Screen 1810 represents the same structure of screen 1801 that has
been completely harvested wherein all of the icons and elements identified in
source code of the site have been obtained for possible template filling. It is
noted that the template enables a voice application to operate with the goal of
obtaining and rendering updated data according to the constraints
established by an administrator. Web site 1810 is pre-prepared for template
filling. Icons are labeled 1812a through 1812n and classifications are labeled
1813a through 1813n.

Object 1810 is generated to emulate the generic structure of the Web 
site including graphics, charts, dialog boxes, text links, data fields, and any 
other offered feature that is present and enabled in the HTML or other
language of the site. Because of the mitigating factors involved with a
potentially large number of users accessing a voice portal to receive dialog,
much streamlining is desired for user convenience as well as network load
stabilization. Therefore, an intermediate step for object modeling elements
and reorganizing the tree hierarchy is needed so that a voice application
template can be filled according to a desired selection and hierarchy thus
facilitating a more economic, optimized construction and execution of a
resulting voice application.

The object modeling tools of the invention can be provided as part of
client application 141 described with reference to Fig. 1B above. Created
objects organized by hierarchy and desired content can be stored in
application server 110 described with reference to Fig. 6 above or in a local
database accessible to voice application server 110.

Fig. 19 is a block diagram illustrating the Web site structure 1801 of
Fig. 18 and a Web site object created and edited for template creation.
Screen 1801 is analogous to screen 1801 of Fig. 18 both in element and
description thereof; therefore none of the elements or description of the
elements illustrated with respect to structure 1801 of Fig. 18 shall be
reintroduced.

Screen 1910 represents a harvested Web site that started out with 
structure 1801, but has since been reorganized with element prioritization for 
the purpose of populating a voice application template in an optimized
fashion. It can be seen in this example that significant editing has been
performed to alter the original content and structure of the harvested Web
site. Icons 1912a through 1912n illustrate the icons that have been retained
after harvesting. 1913a through 1913n represent the classifications of those
objects. Firstly, an optimization is noted with respect to icons labeled Home
and Login in structure 1801. These items in harvested object 1910 have
been optimized through combination into one specified object labeled login
and given the element number 1913a. In this case Account Status and
History is streamlined to Balance, the most valuable piece and the most
commonly requested information. Also in this case any charts, graphs or
other visuals that may not be understood if rendered as a voice dialog are
simply eliminated from the voice application template. The intermediate
step for organization before template filling would be inserted in between
steps of harvesting the Web site data and populating the voice application
header. 

After successful login, wherein the user inputs a voice version of the 
PIN/User Name/Password combination and is granted access to the voice
application from a voice portal, the next priority in this example is to enable
the user to quickly determine his or her account balance or balances.
Element numbers 1912b and 1912c represent 2 balances assuming 2
accounts. There may be more or fewer prioritized icons without departing
from the scope of the invention. In this case, the first "voice option"
provided through the optimization process is to have account balances
recited by telephone to the participating user. The other present and offered
options of Look-up, Trade, and Quote, illustrated herein by element numbers
1913c through f, are moved into a higher but same level of architecture or
structure meaning that they are afforded the same level of importance. All
three of these options are related in that a user request or response
containing stock symbol information can be used to initiate any of the
actions. 

Fig. 20 is a process flow diagram illustrating added steps for
practicing the invention. At step 2000, an administrator operating client
application 141 described with reference to Fig. 17 above harvests the Web
site for source data and data structure. At step 2001, the administrator
creates an editable object representing the existing structure hierarchy of the
target Web site. The object tree has the icons and associated properties and
is executable when complete. In one embodiment, many of the standard
icons and properties shared by many Web sites are provided for the
administrator so that simple drag and drop operations can be used to create
the tree. If a developer has to create a specific object from scratch, the
source mark-up language can be used to construct the object from object 
building blocks representing object components. The new objects can then 
be saved to storage and re-used. 

In one embodiment, rendering the source description as instruction to 
a modeling engine automatically creates the object tree. In this case, the 
harvested object is presented to the administrator as harvested and "ready to 
edit" wherein steps 2000 and 2001 are largely if not completely transparent 
to the administrator. In another embodiment, the administrator simply drags
and drops icons using a mouse provided with the workstation employed to 
do the modeling. 

At step 2002, the administrator may edit some objects to make them 
fit the constraints of VXML voice rendering more completely. In the same 
step he or she may delete certain objects from the tree altogether. Still 
further in the same step the administrator may move and group objects 
according to priority of rendering. If a Web site contains a login 
requirement it will, of course, be the highest priority or the first executable 
dialog of the resulting voice application. Complicated logins may be 
simplified. Moreover one or more objects can be combined to be rendered 
in a same dialog. There are many possibilities. 
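
The editing operations of step 2002 can be pictured with the following minimal sketch of an object tree. The names SiteObject, VISUAL_KINDS, prune_visuals, and dialog_order are hypothetical, as are the kind labels and the numeric priority field. The sketch assumes that visual-only objects are identified by kind and that a lower priority number means earlier rendering, with any login object always placed first among its siblings.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed labels for content that cannot be meaningfully voiced.
VISUAL_KINDS = {"chart", "graph", "image"}


@dataclass
class SiteObject:
    """One node of a harvested object tree (illustrative only)."""
    label: str
    kind: str = "link"
    priority: int = 100  # lower values are rendered earlier
    children: List["SiteObject"] = field(default_factory=list)


def prune_visuals(node: SiteObject) -> SiteObject:
    """Return a copy of the tree with visual-only objects deleted."""
    kept = [prune_visuals(child) for child in node.children
            if child.kind not in VISUAL_KINDS]
    return SiteObject(node.label, node.kind, node.priority, kept)


def dialog_order(node: SiteObject) -> List[str]:
    """List labels depth-first, login first, then by ascending priority."""
    ordered = sorted(node.children,
                     key=lambda child: (child.kind != "login", child.priority))
    labels: List[str] = []
    for child in ordered:
        labels.append(child.label)
        labels.extend(dialog_order(child))
    return labels
```

Because prune_visuals returns a new tree, the harvested original remains available if the administrator later decides to restore a deleted object.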

In still another embodiment, an object tree may be flattened to one 
level or an object tree may be expanded to contain more levels. The 
administrator may also insert content (rendered to dialog) that was not
originally available from the Web site. The new content may be placed
anywhere in the object tree and will subsequently take its place of priority in
the resulting dialogs of the voice application. Once the voice application is
complete, the initiation and execution of the application leads to data access
and retrieval of any new data at the site. A standard navigation template is
used to access the site and data is retrieved only according to class of data
identified in the object tree. In this way unwanted data is not repeatedly
accessed from the same Web site.
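
The tree edits described above (flattening a tree to one level, inserting content that was not on the Web site, and ordering objects by rendering priority) can be illustrated with a minimal sketch. The `Node` class, its `priority` field, and the sample object names are hypothetical and are not part of the disclosed system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One harvested Web-site object; all field names here are hypothetical."""
    name: str
    priority: int = 100  # a lower number is rendered earlier
    children: List["Node"] = field(default_factory=list)


def flatten(node: Node) -> List[Node]:
    """Collapse everything below `node` to one level (depth-first)."""
    flat: List[Node] = []
    for child in node.children:
        flat.append(Node(child.name, child.priority))
        flat.extend(flatten(child))
    return flat


def dialog_order(nodes: List[Node]) -> List[str]:
    """Order objects by rendering priority; ties keep their original order."""
    return [n.name for n in sorted(nodes, key=lambda n: n.priority)]


site = Node("root", children=[
    Node("news", priority=50),
    Node("account", priority=10, children=[Node("balance", priority=20)]),
    Node("login", priority=0),  # a login requirement is rendered first
])
# Content that did not exist on the Web site is inserted like any other object.
site.children.append(Node("holiday notice", priority=5))

order = dialog_order(flatten(site))
print(order)
```

In this sketch the login object sorts first because it carries the lowest priority number, which mirrors the statement that a login requirement becomes the first executable dialog.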

In step 2003, the voice application template is populated as described 
above. At step 2004, the administrator can begin to parameterize the voice 
application execution including establishment of all of the CTI contact 
parameters. At step 2005, the administrator can create dialog. 

It will be apparent to one with skill in the art that pre-organizing
Web-harvested content for voice rendering is an extremely useful step for
reducing complexity, reducing network and processor load, and for
providing only pertinent and useful voice renderings to users accessing or
contacted in the sense of outbound dialing from a connected voice portal
system.

Enhanced Security 

Fig. 21 is a block diagram illustrating secure connectivity between a
Voice Portal and a Web server according to an embodiment of the invention.

The connection scheme illustrated in this example connects a user
(not shown) accessing a voice portal 2106 wherein portal 2106 has network
access to Web-based data illustrated herein within Internet 2108, more
particularly from a Web server 2109 connected to a database 2110.

Voice portal 2106 comprises a voice application server (VAS) 2103
connected to an XML gateway 2104 by way of a data link 2105. In this
embodiment, data hosted by server 2109 is culled therefrom and delivered
to XML gateway 2104 by way of line 2107. Application server 2103 then
generates voice applications and distributes them to users having telephone
connection to PSTN 2101. Telephony switches, service control points,
routers and CTI-enabled equipment known to telephony networks may be
assumed present within PSTN 2101. Similarly, routers, servers and other
nodes known in the Internet may be assumed present in Internet 2108. The
inventor deems the illustrated equipment sufficient for the purpose of
explanation of the invention.

Typically, a voice access to voice portal 2103 from anyone within
PSTN 2101 may be assumed to be unprotected whether it is an inbound or
an outbound call. That is to say that anyone with a telephone line tapping
capability can listen in on voice transactions conducted between users'
phones and the voice application server. Typically, under prior art
conventions for phone transactions, IVR entry of a social security number
and PIN identification is sufficient to access account information. However,
anyone else with the same information can also access the user's automated
account lines to find out balance information and so on.

Server 2109 may be protected with Web certificate service wherein a 
user (on-line) accessing any data from server 2109 must send proof of 
acceptance and signature of the online authentication certificate. These 
regimens are provided as options in a user's Browser application.

One way to extend security to the point of XML gateway 2104 is 
through a completely private data network. A less expensive option is a 
VPN network as is illustrated in this example. Another way is through SSL
measures such as HTTPS. Any of these methods may be used to extend the
security regimens of server 2109 to Voice portal 2106. In this embodiment,
gateway 2104 is adapted to operate according to the prevailing security
measures. For example, if a user goes online to server 2109, changes his or
her password information and signs a Web authentication certificate, the
same change information would be recorded at the voice portal.

The only security lapse then is between a user in the PSTN and
portal 2106. Information sent as voice to any user and response voice sent
from any user can be obtained by tapping into line 2102. One possible
solution to protect privacy to some extent would be to use a voice
translation mechanism at the voice portal and at the user telephone. In this
way, the voice leaving the portal can be translated to an obscure language or
even code. At the user end, the device (not shown) translates back to the
prevailing language and plays on a delay over the telephone speaker system.

One with skill in the art will recognize that an additional advantage of
using the existing security, VPN, SSL, etc. is that the security system has
already been tested, and is being constantly improved. One with skill in the
art will also recognize that many variations can be provided without
departing from the spirit and scope of the invention. For example, outsourced
Web hosting may be used. Multi-site Web systems can be used for
redundancy. Outsourced voice services or multi-service/location voice
services may also apply.

Vocabulary Management for Recognition Options

According to yet another aspect of the invention, the inventor 
provides a vocabulary management system and method that enhances 
optimization of voice recognition software. The method and apparatus are
described in the enabling disclosure below.

Fig. 22 is a block diagram illustrating the architecture of Fig. 1B
enhanced with a vocabulary management server 2200 and software 2201 
according to an embodiment of the present invention. 




The system architecture of this embodiment is largely analogous to
the architecture discussed with reference to Fig. 1B above. Therefore,
elements present in both examples Fig. 1B and Fig. 22 shall not be
reintroduced unless modified to practice the present invention.

Vocabulary management server 2200 is adapted with an instance of
vocabulary management software (VMS) 2201 for the purpose of tailoring
voice recognition template options to just the required vocabulary to fully
enable the instant voice application.

Server 2200 may be presumed to have a data storage facility
connected thereto or held internally therein adapted for the purpose of
warehousing and organizing data. With regard to harvesting Web data and
using the harvested Web data as source data for voice dialog as described
further above with reference to the example of Fig. 17, the Web-based
components are represented in this embodiment by Internet access lines, one
connected from workstation 140 giving it Web access and another
connecting voice application server 110 giving it access through
database/resource adapter 113. In this way, Web access to any targeted
Web-based data for auto harvesting, interpretation, and translation to voice
dialog is assumed.

Server 2200 can be accessed from workstation 140 running client
application 141 through voice application server 2202 or more particularly
through database resource adapter 113 over a data link 2203. In this way,
an administrator can set up and manipulate vocabulary options attributed to
specific on-line or off-line (internal) data sources.

VMS software 2201 is adapted to enable separate and segregated
sets of vocabulary specific to certain target data accessed and functions
allowed in conjunction with the target data. In one embodiment, additional
subsets of vocabulary of a same target data source can be provided that are
further tailored to specific clients who access the data through interaction
from portal 143 over PSTN 134. Rule sets specific to the created
vocabulary sets are created and tagged to the specific vocabulary sets and
provided to application logic 112.

VXML compliant telephony server 130 has a text-to-speech and a
speech-to-text capable engine 2205 provided therein as an enhanced engine
replacing engine 132 described with reference to Fig. 1B. In one
embodiment the separate functions may be enabled by separate components.
The inventor illustrates a single engine with dual capabilities for illustrative
purpose only. Engine 2205 has access to vocabulary management server
2200 through a data link 2202.

Server 2200 is accessible from application logic 112 of voice
application server 110 by way of a data link 2204 and from database
resource adapter 113 by way of a data link 2203. In one embodiment, a
single data link is sufficient to enable communication between the
just-mentioned components in voice application server 110 and server 2200.

In practice of the invention, assuming a Web-based data source is
accessed, the voice recognition operates in a different way from previously
described embodiments. For example, assume a client is accessing voice
portal 143 in PSTN 134 from telephone 135 to interact with his or her
personal investment Web page that contains options for account balance
rendering and for stock trading. A specific vocabulary for the target Web
site is available in server 2200 managed by VMS 2201. Perhaps a sub-set of
the vocabulary particular to the client also exists and is organized under the
parent vocabulary set.

Telephony server 130 recognizes the accessing user and an existing
voice application is triggered. Voice application server 2202 connects to the
Web site on behalf of the user through database resource adapter 113 and
the Internet access line. Following the constraints of the voice application
template, the database resource adapter provides the user login and
password information after the user communicates these in the first or
opening dialog and then gets the account data and any other updated data
that the user is entitled to. The first dialog response rendered to the user
from the voice application may contain only the stock values pertinent to the
user account and the existing monetary balances associated with the specific
symbols. While there may be more information available to the user, some
of the available information may not be pertinent to or useful to the user.

Therefore, before each dialog rendering, VMS 2201 provides the
appropriate vocabulary and rule set for the particular dialog function, in
some cases particular as well to the accessing user. Therefore, voice
recognition software is not required to search a large vocabulary to
interpret the rendered VXML page. In this case, the VXML page itself is
limited by the vocabulary management function before it is delivered to
telephony server 130.

In another embodiment, intervention from VMS 2201 may occur
after the standard VXML page is rendered but before voice recognition
begins in server 130. In this case, engine 2205 consults server 2200 to
obtain the appropriate vocabulary constraints. In this example data not
recognized from VXML is simply dumped. There are many differing points
along the dialog process where VMS 2201 may be employed to streamline
the voice recognition function. For example, in the first dialog response
described further above, the user may be prompted to initiate any desired
trading activity. If the user elects to do some trading then the speech to text
portion of engine 2205 may consult VMS 2201 for a limited trading
vocabulary that is tailored to that client. Such a vocabulary may be
expanded for a different client that is, for example, a VIP and has, perhaps,
more allowable options. Voice renderings from the client that do not match
the provided vocabulary and/or do not conform to the rules are ignored.
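
The per-dialog, per-client vocabulary lookup described above can be illustrated with a minimal sketch. It assumes a parent vocabulary keyed by target site and dialog, with optional client-specific additions layered on top; the dictionary names, the site name `invest.example`, and the client identifier are hypothetical and stand in for whatever VMS 2201 actually stores.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical store: (target site, dialog) -> words allowed for every client.
PARENT_SETS: Dict[Tuple[str, str], Set[str]] = {
    ("invest.example", "trading"): {"buy", "sell", "quote"},
}
# Hypothetical per-client additions organized under the parent set.
CLIENT_EXTRAS: Dict[Tuple[str, str, str], Set[str]] = {
    ("invest.example", "trading", "vip-001"): {"margin", "option"},
}


def vocabulary_for(site: str, dialog: str, client: Optional[str] = None) -> Set[str]:
    """Return only the vocabulary needed for this dialog and this client."""
    words = set(PARENT_SETS.get((site, dialog), set()))
    if client is not None:
        words |= CLIENT_EXTRAS.get((site, dialog, client), set())
    return words


def recognize(utterance: str, allowed: Set[str]) -> Optional[str]:
    """Accept an utterance only if it is in the limited vocabulary; else ignore it."""
    word = utterance.strip().lower()
    return word if word in allowed else None


standard = vocabulary_for("invest.example", "trading")
vip = vocabulary_for("invest.example", "trading", "vip-001")
print(recognize("Sell", standard))    # in the parent set
print(recognize("margin", standard))  # not allowed for a standard client
print(recognize("margin", vip))       # allowed for the VIP client
```

The recognizer here is a plain set lookup standing in for a speech engine; the point of the sketch is only that the search space is fixed before recognition begins.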

In addition to personalizing and streamlining vocabulary options for
voice recognition, an administrator can use VMS to create new vocabulary
and/or to create a plurality of synonyms that are recognized as a same
vocabulary word. For example, an administrator may configure stock, share,
and security as synonyms to describe paper. Sell, short, and dump may all be
understood as synonyms for selling paper. There are many variant
possibilities. In general, VMS 2201 can be applied in one communication
direction (from service to user) as a management tool for limiting data on a
VXML page for rendering, or for limiting voice recognition of the VXML
page and dumping the unrecognized portion. VMS 2201 can be applied in
dialog steps in the opposite direction (from user to service) to tailor voice
recognition options allowed for a user or a user group according to service
policy and constraint.
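
The synonym configuration described above can be sketched as a mapping from spoken words to a single canonical vocabulary word. The table below uses the examples given in the text; the function names and the whitespace tokenization are assumptions made for illustration only.

```python
from typing import Dict, List, Optional

# Synonyms from the example in the text, mapped to one canonical word each.
SYNONYMS: Dict[str, str] = {
    "stock": "paper", "share": "paper", "security": "paper",
    "sell": "sell", "short": "sell", "dump": "sell",
}


def canonical(word: str) -> Optional[str]:
    """Map a spoken word to its canonical vocabulary word, or None if unknown."""
    return SYNONYMS.get(word.strip().lower())


def parse_command(utterance: str) -> List[str]:
    """Keep only recognized words, in order, replaced by their canonical form."""
    result = []
    for token in utterance.split():
        word = canonical(token)
        if word is not None:
            result.append(word)
    return result


print(parse_command("please short the stock"))
print(parse_command("Dump security"))
```

Both sample utterances reduce to the same canonical pair, so downstream dialog logic only has to handle one form of the request.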

In an embodiment where VMS 2201 works only with the VXML
stream, it may be located within application server 110 or within telephony
server 130. It is conceivable that different dialogs (both initial and response
dialogs) of a same voice application for a same client accessing a single data
source can be constrained using different vocabulary sets using VMS 2201.
Therefore the optimum level of management capability is at the level of
action/response. By limiting the work of voice recognition processing at
every available step during interaction, much processing power and
bandwidth can be reserved for other uses.



Local Cache Optimization (static, dynamic)

In yet another aspect of the present invention a method and
apparatus for reducing data traffic is provided that uses local cache
optimization in a VXML distribution environment.

Fig. 23 is a block diagram illustrating various functional components
of a VXML application architecture 2300 including cache optimization
components according to an embodiment of the present invention. Fig. 23 is
quite similar to Fig. 1, except updated and showing additional detail.

Architecture 2300 comprises a voice application server 2301, and a
telephony server/voice portal 2302 as main components. Portal 2302
comprises a speech generator 2306 and a telephony hardware/software
interface 2305. Portal 2302 is VXML-compliant by way of inclusion of a
VXML interpreter 2307 for interpreting VXML data sent thereto from
application server 2301. Voice portal 2302 is maintained as an access point
within a telephony network such as the well-known PSTN network.
However, portal 2302 may also be maintained on a wireless telephony
network.

A Web interface 2303 is illustrated in this example and serves as an
access point from the well-known Internet or other applicable DPN. Voice
portal 2302 may represent a CTI-enhanced IVR system, customer service
point, or any other automated voice portal system. In the case of a Web-based
portal, component 2303 may be a Web server, a computer connected
to the Internet, or any other type of node that provides a user interface.

Voice application server 2301 is similar in many respects to voice
application server 2202 described with reference to Fig. 22. In this regard,
voice application server 2301 has voice application development software
(VADS) 2308 installed and executable thereon. VADS 2308 illustrated
within the domain of voice application server 2301 has certain modules that
shall herein be described using labels and shall not have element numbers
assigned to them because of limited drawing space. Modules illustrated in
VADS 2308 include a contact manager (Contact Mgr.) instance adapted as a
developer's tool for managing the parameters of dialog recipients. A dialog
controller (Dialog Ctrl.) is provided as a developer tool for creating and
managing voice application dialogs and for initiating interface operations to
rules sources and internal/external data sources. A Fonelet controller
(Fonelet Ctrl.) is provided within VADS 2308 and adapted to control the
distribution of subsequent dialogs of a voice application. An XML generator
(XML Gen.) is provided within VADS 2308 and adapted to generate XML
for VXML pages.

Voice application server 2301 has application logic 2309 provided 
therein and adapted to control various aspects of application delivery, 
creation, and management. Application logic 2309 includes a rule manager 
(Rule Mgr.) for providing the enterprise rules for application creation and 
deployment via the contact manager and dialog controller referenced above,
and rules for ongoing user and system interactions with running applications. 
A dialog runtime processor (Dialog Run T. Prcsr.) is provided and adapted 
to control the way a completed dialog of a voice application is launched and 
formatted. A Fonelet runtime processor (Fonelet Runtime Prscsr.) is 
provided within application logic 2309 and controls various and sundry 
aspects of how voice applications (Fonelets) are executed and 
choreographed in real time. A dynamic grammar generator (Dynamic 
Grammar Gen.) is provided within application logic 2309 and is adapted to 
generate grammar keywords in association with non-recurring dialog content
wherein the user, to retrieve instant results in a dynamic fashion, can speak
the generated keywords.

New components not before introduced within the application logic
in server 2301 are a static optimizer 2312, and a dynamic optimizer 2311.
The goal of the present invention is to reduce data traffic
between portals 2302 and 2303 (if Web enabled) and voice application
server 2301. Accomplishing a reduction in data traffic between the voice
application server and voice portals is especially important where the
components are remote from one another and connected through relatively
narrow data pipelines. Such pipelines can become bottled up with data at
peak performance periods during operation causing a notable delay in
response time at the voice portals. More detail about optimizers 2312 and
2311 and their relationship to the dialog runtime processor will be provided
later in this specification.

Server 2301 has a data/resource adapter block 2310 that contains all
of the required modules for interfacing to external and to internal data
sources. For example, an application manager (App. Mgr.) is provided
within adapter 2310 and is adapted as a main interface module to user-end
systems such as portals 2302 and 2303. The application manager provides
the appropriate data delivery of dialogs in order of occurrence, and in a
preferred embodiment of the invention delivers static and dynamic dialog
pieces (determined through optimization) for storage to one or more cache
systems local to the user's end system. More about the role of the
application manager will be provided further below.

A report manager (Report Mgr.) is within adapter 2310 and is
adapted to work with the application manager to provide reportable statistics
regarding operation of voice application interactions. Report manager tracks
a Fonelet (voice application) until it is completed or terminated.
Background statistics can be used in the method of the present invention to
help determine what dynamic (non-recurring) dialog pieces of a voice
application should be cached locally on the user-end.

A third-party Web-service provider 2313 is illustrated in this example
as external to server 2301 but linked thereto for communication. Third-party
service 2313 represents any third-party service provider including
software that can be used to tap into the voice application development and
deployment services hosted within server 2301. Thin software clients
licensed by users fall under third-party applications as do Web-based services
accessible to users through traditional Web sites. To facilitate third-party
connection capability, server 2301 has a Web resource connector (Web. Res.
Conn.) that is adapted as a server interface to third-party functions. A
Fonelet event queue (Fonelet Event Queue) is provided within adapter 2310
and is adapted to queue incoming and outgoing Fonelet (voice application)
events between the server and third-party-provided resources. A Fonelet
XML interpreter (Fonelet XML Int.) is provided within adapter 2310 and
adapted to interpret XML documents incoming to or outgoing from the
Fonelet event queue.
A resource manager (Resource Mgr.) is provided within adapter
2310 and is adapted to manage access to all accessible resources both
external and internal. It is noted that internal resources may be maintained
within the server itself, or within a domain of the server, the domain
including other systems that may be considered within the domain such as
internal data systems within a contact center hosting the voice application
server, for example. A database access manager (Database Access Mgr.) is
provided within adapter 2310 and is adapted to facilitate data retrieval from
persistent data storage provided and associated with data stores located
internally to the domain of server 2301.

A VXML rendering engine 2314 is provided within application
server 2301 and is adapted to render VXML pages in conjunction with the
dialog controller in VADS 2308. Rendering engine 2314 is analogous to
engine 111 described with reference to Fig. 22 and Fig. 6 above.

Server blocks 2310, 2309, 2308, and engine 2314 communicate and
cooperate with one another. Communication and cooperation capability is
illustrated in this example by a logical server bus structure 2315 connecting
the blocks for communication. A similar logical bus structure 2316 is
illustrated within portal 2302 and connects the internal components for
communication.

As previously described above, a voice application, once launched,
comprises a series of interactive dialog pieces that produce both static and
dynamic results. For example, a company greeting that is played to every
caller is considered a static greeting because there are no dynamic changes in
the dialog from caller to caller. However, a dialog response to a user
request for a stock quote is considered dynamic because it can vary from
caller to caller depending on the request. Similarly, data results pulled from
a database or other external data source that are embedded into response
dialogs cause the dialogs themselves to be considered dynamic because,
although the basic template is static, the embedded results can vary between
callers.

Static optimizer 2312 and dynamic optimizer 2311 are provided to
work in cooperation with the dialog runtime processor to identify pieces of
dialog that should be distributed to end system cache storage facilities for
local access during interaction with an associated voice application.
Optimizers 2312 and 2311 are software modules that monitor and read
dialog files during their initial execution or when the associated voice
application is modified. Static optimizer 2312 cooperates with the rule
manager and tags, according to business rule, certain files that can be labeled
static or recurring files that do not change from caller to caller. Dynamic
optimizer 2311 cooperates with the rule manager and tags, according to
business rule, certain files that are non-recurring from caller to caller, but are
repeated often enough to warrant distributed caching to a cache local to an
end system through which the associated voice application is accessed.

In one embodiment, optimizers 2312 and 2311 are embedded
modules running within the dialog runtime processor. In another
embodiment, the optimizers are separate modules that are activated by the
runtime processor when it processes dialogs of a particular voice application.

When an administrator changes a voice application, or when a new
voice application is created, then optimization processes of optimizers 2311
and 2312 are invoked to determine which data out of the application flow
needs to be cached. Tagging can take the form of various file identification
regimens known in the art. In a preferred embodiment, standard HTTP 1.1
tagging is used. The optimizing components 2312 and 2311 can either add
tags to untagged files, or, in some cases remove tags from already tagged
files. This automated process allows an administrator to create dialogs
without worrying about distribution issues that are associated with data
traffic between servers.
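
The text names standard HTTP 1.1 tagging without stating which headers are used. As a minimal sketch, the example below assumes the `Cache-Control` and `ETag` response headers defined by HTTP/1.1; the three file classes, the lifetimes, and the function name are hypothetical choices made for illustration.

```python
import hashlib
from typing import Dict


def cache_headers(body: bytes, kind: str, max_age: int = 86400) -> Dict[str, str]:
    """Build HTTP/1.1 cache headers for a dialog file.

    kind is a hypothetical classification produced by the optimizers:
    "static" (same for every caller), "dynamic" (popular non-recurring
    dialog worth caching briefly), or anything else (never cached).
    """
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if kind == "static":
        return {"Cache-Control": f"public, max-age={max_age}", "ETag": etag}
    if kind == "dynamic":
        # A shorter lifetime is assumed so that stale results expire quickly.
        return {"Cache-Control": "public, max-age=300", "ETag": etag}
    # Removing the tag, in effect: tell every cache not to store the file.
    return {"Cache-Control": "no-store"}


greeting = b"thank you for calling XYZ corporation"
print(cache_headers(greeting, "static"))
print(cache_headers(b"menu for automated attendant", "dynamic"))
print(cache_headers(b"your balance is ...", "personal"))
```

An end-system cache that honors these headers would keep the greeting for the stated lifetime and could revalidate it by comparing the `ETag` value when the application is modified.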

For static files, optimizer 2312 identifies which files to cache at an
end system, tags them appropriately and prepares the tagged files for
distribution to identified end-system cache. In the case of portal 2302 being
the end system, the static files of a voice application would be stored locally
in block 2305 in server cache. In one embodiment, the distributed static files
are cached at a first deployment of a recently modified or brand new voice
application. The first consumer to access the application will not experience
optimum performance because the static files are cached
during the first interaction. However, a subsequent consumer accessing the
application from portal 2302, or a first caller that repeats the static portion
of the application, will experience a performance increase because the
telephony server will access and serve the static portion of the application
from local cache instead of retrieving the dialogs from application server
2301 every time they are requested. It is noted herein that caching static and
dynamic content is temporary in a preferred embodiment. That is to say that
when a voice application is no longer used by the enterprise, or is replaced
by a new application, the unnecessary files are deleted from the cache
systems.

Once static dialogs from voice applications are distributed to and
cached within the telephony server portion of portal 2302, they can remain in
cache for subsequent retrieval during subsequent interaction with associated
voice applications. However, if a voice application is subsequently modified
by an administrator and different dialogs are now identified as static
cacheable dialogs, then those dialogs already cached will be replaced with
the newer updated static dialogs. Any common form of identification and
revision strategy can be used to synchronize the appropriate static files.
Some dialogs may simply be dropped from an application being modified
while other static dialogs may be newly added. In these instances of
subsequent application modification concerning the presence of new, deleted
or modified files that are deemed static, the synchronization of these files
with those already stored can take place before an application is scheduled to
be deployed to the end system, or during runtime of the application.
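
The text leaves the identification and revision strategy open. One simple possibility, shown here only as a sketch, compares a name-to-revision map of the files already cached against the map for the modified application and derives which files to fetch and which to delete. The file names and revision strings are hypothetical.

```python
from typing import Dict, List, Tuple


def plan_sync(cached: Dict[str, str], current: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Compare name -> revision maps and return (files to fetch, files to delete)."""
    # New files and files whose revision changed must be fetched again.
    fetch = sorted(name for name, rev in current.items() if cached.get(name) != rev)
    # Files dropped from the modified application are removed from the cache.
    delete = sorted(name for name in cached if name not in current)
    return fetch, delete


cached = {"greeting.wav": "r1", "menu.wav": "r1", "hours.wav": "r3"}
current = {"greeting.wav": "r1", "menu.wav": "r2", "directions.wav": "r1"}

fetch, delete = plan_sync(cached, current)
print(fetch)   # modified and newly added static dialogs
print(delete)  # dialogs dropped from the application
```

The same comparison could run before deployment or during runtime, since it needs only the two maps.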

In a preferred embodiment of the invention caching of dynamic files
is performed in the voice Web controller module within telephony
software/hardware block 2305 of portal 2302. Dynamic files are different
from static files in that dynamic files do not have to be retrieved during every
execution and interaction with a voice application. Therefore, dynamic
retrieval occurs only after user interaction with a voice application has
begun. Statistical analysis can be used at voice application server 2301 to
determine, over several voice application deployments, which files make
sense to continue to distribute to end-system cache facilities and, in some
cases, which files already cached for dynamic optimization should be deleted
and subsequently removed from end-system local access.

Fig. 24 is a process flow diagram illustrating steps for practice of the 
present invention in a particular embodiment. At step 2400a, a static 
greeting message is played such as "thank you for calling XYZ corporation". 
Once a voice application containing this dialog has been accessed from an 
end system, the particular dialog is stored locally if it is identified as a static 
dialog. Each time a subsequent access is made to the same voice 
application, greeting 2400a is pulled from local cache in step 2401 when
ordered. 

At step 2400n a last static message is played, which in this
embodiment represents a menu message. It will be appreciated that there
may be multiple static dialogs in a voice application as indicated in this
example by the element assignment of 2400a-n. Each time
any static message 2400a-n is required in the voice application execution, it
is pulled from local cache in step 2401. The message played at step 2400n is
a precursor to interaction such as "We have changed our menu. Please
listen carefully. Your phone call may be recorded for training purposes."

Because messages 2400a-n are played at the beginning part of, for 
example, an IVR interaction regardless of who the caller is, they can be 
statically cached within the telephony server representing the accessed end 
system or application consumer. As previously described above, HTTP 1.1
standard tags may be used to indicate which material to cache. The local
server keeps the static files in store and uses them according to the
appropriate application flow whenever a call comes in to the number or 
extension of that particular voice application. In some cases voice 
applications will be numerous at a single contact number with extensions 
separating them for access by callers. 

Without local caching of the static content, the telephony server
would typically make a request to the Web controller, which would then
send a request to the runtime processor and fetch the message from the
dialog runtime processor. The sound file would be sent from the processor
back over the same network connection to the telephony server for instant
play. It will be appreciated that local caching of dialog portions of a
dynamic interactive voice application saves significant bandwidth between the
portal and the application server. Examples of other types of static dialogs
that may be cached locally to an end-system include hours of operation,
location or driving instructions, billing address, and so on which, in essence,
never change dynamically.
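
The bandwidth saving described above can be illustrated with a minimal sketch of a portal-side cache that falls back to the application server only on a miss. The `PortalCache` class, the `fetch_from_server` callable, and the dialog identifiers are hypothetical stand-ins for the telephony server and the dialog runtime processor.

```python
from typing import Callable, Dict


class PortalCache:
    """Serve cacheable dialogs locally; go to the application server otherwise."""

    def __init__(self, fetch_from_server: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_server
        self._store: Dict[str, bytes] = {}
        self.server_requests = 0  # counts round trips over the network link

    def get(self, dialog_id: str, cacheable: bool) -> bytes:
        if cacheable and dialog_id in self._store:
            return self._store[dialog_id]  # served from local cache
        self.server_requests += 1
        audio = self._fetch(dialog_id)
        if cacheable:
            self._store[dialog_id] = audio  # cached during the first interaction
        return audio


def fake_server(dialog_id: str) -> bytes:
    """Stand-in for the application server; returns placeholder audio bytes."""
    return f"audio:{dialog_id}".encode()


portal = PortalCache(fake_server)
for _ in range(3):  # three callers hear the same static greeting
    portal.get("greeting", cacheable=True)
portal.get("balance", cacheable=False)  # personal data is always fetched
portal.get("balance", cacheable=False)
print(portal.server_requests)
```

Only the first caller's greeting request reaches the server, which matches the observation that the first consumer does not see the improvement but later callers do.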

At step 2402, a user interacts with the voice application by initiating 
a selection resulting from the menu option dialog of step 2400n. At step 
2403a a dynamic menu option or result is played. The option or result is 
retrieved as a result of the user-initiated selection or interaction to a previous 
static dialog. Therefore the next dialog the user hears is considered non- 
recurring or dynamic. This means that the result or menu option can vary in 
content from call to call, the variance ordered by the first user interaction 
with the voice application. 

The rules that will govern whether or not to distribute a dialog to the
local cache of an end-system through which a particular voice application is
accessed can vary according to content, number of possible options or
results, and in some cases statistical probability. For example, if a voice
application is created for a banking institution wherein a dynamic menu has
options for being transferred to a loan officer, a standard teller, or an
automated account attendant, and statistically, 90% of all callers choose the
transfer to the automated attendant, then the subsequent beginning dialog of
the voice application associated with automated banking can be cached
locally. In this case, the first two options request a live connection thereby
terminating the voice application. The third option links to another dialog of
the same application or to another application entirely. It will follow then
that the next dialog may be static because it merely asks the caller to enter
identification criteria. It is the same dialog for all callers who select
"automated attendant".

It is noted that criteria for dynamic optimization may vary widely. 
For example, personal information results embedded into a standard dialog 
template must be retrieved from the data sources of the institution and 
cannot be locally cached. However, the standard menu soliciting the 
interaction resulting in data fetch of personal information can be cached 
locally. 

Dialogs that are assigned to dynamic caching are retrieved from a 
Web controller in step 2403 each time they are selected. Moreover, step 
2402 may occur repeatedly between dynamically cached dialogs. At step 
2403n, a last dynamic menu option is played in a voice application sequence. 
It may be that statistically only a few users navigate to the end of the voice 
application or last menu. Therefore it may not be considered for local 
caching. However, many standard dynamic options and results can be 
dynamically cached in the event that probability is high that a large number 
of callers are going to request the option or result. 
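
As a non-limiting sketch, a probability-based caching rule of the kind described above (for instance, the banking example in which 90% of callers choose the automated attendant) might be expressed as follows in Python. The function name options_to_cache, the 0.5 threshold, the option labels, and the sample call history are all assumptions made for illustration; the disclosure does not specify a particular threshold or data format.

```python
from collections import Counter
from typing import Dict, List


def options_to_cache(
    selections: List[str],
    leads_to_dialog: Dict[str, bool],
    threshold: float = 0.5,  # assumed cut-off, not taken from the disclosure
) -> List[str]:
    """Return menu options whose follow-on dialog is worth caching locally.

    selections      -- observed caller choices, one entry per call
    leads_to_dialog -- True if the option continues into another dialog,
                       False if it ends the application (e.g. live transfer)
    """
    total = len(selections)
    if total == 0:
        return []
    counts = Counter(selections)
    return sorted(
        option
        for option, n in counts.items()
        if leads_to_dialog.get(option, False) and n / total >= threshold
    )


# Hypothetical call history mirroring the banking example: 90% of callers
# choose the automated attendant; the other options request a live connection.
history = ["automated_attendant"] * 90 + ["loan_officer"] * 6 + ["teller"] * 4
continues = {"automated_attendant": True, "loan_officer": False, "teller": False}
cached = options_to_cache(history, continues)
```

With the sample history, only the dialog following the automated attendant selection qualifies, since the other two options terminate the voice application and fall well below the assumed threshold.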

Results that typically are not fluid, such as perhaps the desired model 
and make of a product, are dynamic results because there are other results 
available for return through interaction with the interactive menu. The most 
popular results can be dynamically cached as dialogs that can be retrieved 
locally even though not every caller will interact with the same result. 

Optimizers share database accessibility with all of the other modules 
described with respect to the application server of Fig. 23. Therefore, 
results that are commonly requested, although not completely static, can be 
embedded into the dialog template and saved locally as a voice application 
dialog linked through to a certain selection made as a response to a previous 
dialog of the same application. 

In some cases of dynamic caching, the standard dialog is there 
without the embedded results, which are dynamic. In this case, a client 
application can be provided that retrieves the requested data using the voice 
application server as a proxy and embeds the data into the template locally to 
the user, wherein after the user has accessed the data and moved on in the 
application, the embedded data is then deleted from the template until the 
next invocation. There are many possibilities. 
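
One of the many possibilities mentioned above, a locally cached template whose dynamic data is embedded for one access and then deleted, might be sketched as follows in Python. The class DialogTemplate, the "{result}" placeholder convention, and the fetch_via_proxy callable are hypothetical and assumed only for this illustration; the callable stands in for the client application retrieving data with the voice application server acting as a proxy.

```python
from typing import Callable, Optional


class DialogTemplate:
    """Hypothetical locally cached dialog template (illustrative only)."""

    def __init__(self, text: str) -> None:
        # The static template text; "{result}" marks where dynamic data goes.
        self.text = text
        self.embedded: Optional[str] = None  # dynamic data currently embedded

    def render(self, fetch_via_proxy: Callable[[], str]) -> str:
        """Fetch the dynamic data and embed it into the cached template."""
        self.embedded = fetch_via_proxy()
        return self.text.format(result=self.embedded)

    def clear(self) -> None:
        """Delete the embedded data once the user has moved on."""
        self.embedded = None


template = DialogTemplate("Your current balance is {result}.")
prompt = template.render(lambda: "42 dollars")  # assumed sample data
template.clear()  # template stays cached; the personal data does not
```

The design choice shown here is that only the static template persists at the end-system, while the embedded data lives no longer than a single invocation.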

It will be apparent to one with skill in the art that the method and 
apparatus of the invention can be applied to access of both internal data 
sources and external data sources, wherein some of the external data 
sources are network-based data sources analogous to Web-hosted data and 
data available over other types of digital data networks. 

The method and apparatus of the invention should be afforded the 
broadest interpretation under examination in view of the many possible 
embodiments and uses. The spirit and scope of the invention is limited only 
by the claims that follow. 

What is claimed is: 



